Federated Learning Meets Blockchain in Decentralized Data Sharing Healthcare Use Case
Abstract: In the era of data-driven healthcare, the amalgamation of blockchain and federated learning (FL) introduces a paradigm shift toward secure, collaborative, and patient-centric data sharing. This article pioneers the exploration of the conceptual framework and technical synergy of FL and blockchain for decentralized data sharing, aiming to strike a balance between data utility and privacy. FL, a decentralized machine learning paradigm, enables collaborative AI model training across multiple healthcare institutions without sharing raw patient data. Combined with blockchain, a transparent and immutable ledger, it establishes an ecosystem fostering trust, security, and data integrity. This article elucidates the technical foundations of FL and blockchain, unravelling their roles in reshaping healthcare data sharing, and illustrates the potential impact of this fusion on patient care. The proposed approach preserves patient privacy while granting healthcare providers and researchers access to diversified data sets, ultimately leading to more accurate models and improved diagnoses. The findings underscore the potential acceleration of medical research, improved treatment outcomes, and patient empowerment through data ownership. The synergy of FL and blockchain envisions a healthcare ecosystem that prioritizes individual privacy and propels advancements in medical science.

Index Terms: Blockchain, data sharing, Dataspace 4.0, decentralized data sharing, federated learning (FL), healthcare, Industry 4.0, Industry 5.0, IoE.

Manuscript received 27 November 2023; revised 5 February 2024; accepted 13 February 2024. Date of publication 19 February 2024; date of current version 23 May 2024. This work was supported by the Science Foundation Ireland under Grant SFI/12/RC/2289_P2. The work of Raushan Myrzashova was supported by the ANSO Scholarship for Young Talents. The work of Xi Wei was supported by the Natural Science Foundation of Anhui Province under Grant BJ2060000039. (Corresponding author: Saeed Hamood Alsamhi.)
Saeed Hamood Alsamhi is with the Insight Centre for Data Analytics, University of Galway, Galway, H91 TK33 Ireland, and also with the Faculty of Engineering, IBB University, Ibb, Yemen (e-mail: Saeed.alsamhi@insight-centre.org).
Raushan Myrzashova is with the School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, Anhui, China (e-mail: [email protected]).
Ammar Hawbani and Liang Zhao are with the School of Computer Science, Shenyang Aerospace University, Shenyang 110136, China (e-mail: [email protected]; [email protected]).
Santosh Kumar is with the Department of Computer Science and Engineering, International Institute of Information Technology-Naya Raipur, Atal Nagar-Nava Raipur 493661, India (e-mail: [email protected]).
Sumit Srivastava is with the Department of Electronics and Communication Engineering, FET, MJP Rohilkhand University, Bareilly 243001, India (e-mail: [email protected]).
Xi Wei is with the Department of Chemistry, University of Science and Technology of China, Hefei 230026, Anhui, China (e-mail: [email protected]).
Mohsen Guizan is with the Machine Learning Department, Mohamed Bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE (e-mail: [email protected]).
Edward Curry is with the Insight Centre for Data Analytics, University of Galway, Galway, H91 TK33 Ireland (e-mail: edward.curry@insight-centre.org).
Digital Object Identifier 10.1109/JIOT.2024.3367249
2327-4662 © 2024 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

I. INTRODUCTION
The rapid development of the Internet of Things (IoT), cloud computing, and big data has led to Dataspace 4.0, a digital ecosystem where massive amounts of data from various sources are seamlessly integrated and shared among stakeholders. Dataspace 4.0, funded by the European Union, aims to establish shared principles for exchanging manufacturing data at the EU level; it is intended to pave the way for a unified manufacturing data ecosystem and foster the formation of a cohesive European community focused on Dataspace 4.0 [1]. Therefore, data sharing is essential in Dataspace 4.0 to create a coherent European community and a unified industrial data environment. With the advent of the sixth generation (6G), the capabilities of Dataspace 4.0 are expected to be further enhanced, providing new opportunities for data-driven applications and services. Dataspace 4.0 refers to the next generation of data management systems expected to enable the integration and sharing of data across various industries and domains [2]. Varga et al. [3] discussed how advanced technologies and the needs set for 6G affect Industry 4.0 developments based on massive data. The foundation of Industry 4.0 is data sharing, which facilitates smooth communication between entities, machines, and processes, improving operational excellence, decision making, and resource usage. Furthermore, Han et al. [4] provided a vision for a 6G industrial digital twin (DT) ecosystem to bridge the gaps between machines, humans, and data infrastructure to enable numerous applications. As a result, data sharing is not merely helpful but essential to achieving the full potential of Industry 4.0 and Dataspace 4.0.

The safe and ethical sharing of private patient data is a crucial challenge at a time when healthcare data is expanding exponentially and there is an increasing demand for data-driven medical advancements. Healthcare institutions, researchers, and patients need to strike a delicate balance between the utility of aggregated medical data for scientific progress and the paramount importance of preserving individual privacy and data security. The challenge has spurred the emergence of innovative technologies poised to reshape the landscape of healthcare data sharing. Data sharing has become an essential component of modern society, enabling businesses,
ALSAMHI et al.: FEDERATED LEARNING MEETS BLOCKCHAIN IN DECENTRALIZED DATA SHARING 19603
governments, and individuals to access and analyze vast amounts of data for various purposes, such as research, decision making, and innovation. However, centralized data-sharing systems have limitations, such as data privacy and security issues [5], interoperability issues [6], and single points of failure [7]. To address these challenges, decentralized data sharing has emerged as a promising alternative that distributes data across multiple nodes or peers without needing a central authority or intermediary. In addition, decentralized data sharing offers several benefits, such as increased privacy and security, improved data ownership and control, and enhanced transparency and accountability [8].

Decentralized data sharing is an essential aspect of Dataspace 4.0, as it allows multiple parties to share data without needing a central authority or intermediary [9], leading to improved collaboration, increased data privacy and security, and the potential for new business models and revenue streams. Several decentralized data-sharing technologies and techniques, such as federated learning (FL) [10] and blockchain [11], have emerged as promising solutions to address these challenges. These technologies have been applied in various domains, such as healthcare, finance, and the IoT, to address specific use cases and requirements. Two such technologies, FL and blockchain, have garnered significant attention for their potential to solve this conundrum. FL, a decentralized machine learning (ML) approach that Google pioneered [12], offers a novel paradigm for collaborative model training across a network of data sources without centralizing raw data. It inherently safeguards data privacy at its source, a crucial factor in healthcare, where data confidentiality is sacrosanct [13]. Initially developed as the underlying technology for cryptocurrencies like Bitcoin [14], blockchain has transcended its financial origins to become a secure and immutable ledger capable of ensuring data integrity and transparency. Its characteristics are well suited to address the need for trust and accountability in data-sharing ecosystems [15]. Despite the potential benefits of decentralized data sharing, several challenges and limitations are associated with the above technologies, such as scalability, interoperability, and regulatory compliance.

In this article, we explore the intersection between FL and blockchain in the context of decentralized data sharing, with a particular focus on the healthcare sector. Our objective is to unravel the synergies between these two technologies, shedding light on how they can be harnessed to revolutionize healthcare data sharing while preserving individual privacy and fostering collaboration. The significance of this article extends beyond theoretical exploration and embraces practical implications for healthcare institutions, researchers, and, ultimately, patients. The combination of blockchain technology and FL has become a game-changer in the quickly developing field of data-driven technologies, providing a fresh approach to decentralized data sharing. In the context of a decentralized data-sharing framework, this article examines the synergies between these two technologies, highlighting how they could transform collaborative data sharing while protecting individual privacy and promoting smooth collaboration.

A. Motivation and Contributions
Modern societies depend on data sharing because it promotes cooperation, spurs innovation, and increases industry transparency [16]. Although it is essential to research, development, and the welfare of society, the explosion in data generation, especially since the introduction of the 6G network and the spread of the IoT, brings new difficulties. Centralized data-sharing solutions now face privacy, security, and accessibility issues once data is shared. To overcome these obstacles, this article proposes a paradigm shift toward decentralized data sharing by utilizing blockchain technology and FL. The synergy of blockchain and FL guarantees enhanced security and privacy and a strong barrier against unwanted access and possible data breaches. Furthermore, it offers protection from changing cyber threats by distributing power and leveraging blockchain’s advantages.

Moreover, the synergy of FL and blockchain gives stakeholders unparalleled control over data in addition to security [17]. It creates an environment of trust and accountability among players by protecting intellectual property rights and promoting openness. At the vanguard of transforming healthcare data exchange, the synergy strategy goes beyond satisfying urgent needs. Safe, effective, patient-centered data sharing will speed up medical research, enhance patient care, and accelerate improvements in healthcare. Our proposed paradigm stands out for resolving the conventional tradeoff between privacy and data sharing. Not only does it comply with strict regulations, but it also dramatically increases productivity and openness in the healthcare industry. In addition to providing a comprehensive solution, our work establishes a new benchmark for the interchange of healthcare information. The combination of blockchain technology and FL promises to transform the healthcare industry by promoting scientific breakthroughs, enhancing patient care, and guaranteeing legal compliance.

Data sharing is pivotal in shaping modern societies, offering myriad benefits that span individuals, organizations, and the broader community [16]. It fosters collaboration, drives efficiencies, and spurs innovation across various sectors. Data sharing enhances transparency and accountability, acting as a bulwark against corruption and building trust among stakeholders [18]. It also streamlines resource utilization, leading to significant cost savings and productivity gains. In public services, data sharing catalyzes research and development, particularly in critical areas like healthcare, environmental conservation, and societal well-being. However, the landscape of data sharing is not without its complexities. With the proliferation of the IoT and the advent of the 6G network, there has been an exponential increase in data generation, presenting both opportunities and challenges. Data sharing in this context raises significant privacy, security, and interoperability concerns, necessitating a careful balance between innovation and risk mitigation. Centralized data-sharing models, traditionally prevalent, are increasingly seen as inadequate due to their inherent privacy and security limitations, reliance on singular management entities, and accessibility challenges. This article argues for a shift toward decentralized data sharing,
19604 IEEE INTERNET OF THINGS JOURNAL, VOL. 11, NO. 11, 1 JUNE 2024
utilizing FL and blockchain technology. Such a decentralized approach leverages distributed computing for efficiency and scalability while harnessing blockchain’s strengths in immutability and security. This method promises enhanced security and privacy, mitigating risks like unauthorized access and data breaches. It also empowers stakeholders by granting greater control over data, fostering transparency, and safeguarding intellectual property rights. Additionally, it promotes interoperability and seamless data exchange, thereby reducing fragmentation and improving collaboration.

Our work is at the forefront of reshaping healthcare data sharing by exploring the synergistic potential of FL and blockchain technologies. Our approach addresses the critical needs of secure, efficient, and patient-centric healthcare data sharing in a world increasingly driven by data. We propose an innovative framework that enables healthcare institutions, researchers, and patients to share data securely and efficiently. This approach not only enhances patient care and accelerates medical research but also promises greater accuracy in diagnoses, personalized treatment options, and rapid advancements in medical science. The primary driving force behind our work is the need to bridge the gap between collaborative healthcare research and the imperative to protect patient data privacy. Our proposed decentralized data-sharing model effectively resolves the traditional tradeoff between sharing and privacy. It aligns with stringent regulatory requirements while boosting efficiency, transparency, and trust in the healthcare sector.

The main contributions of this article are encapsulated in the development of a groundbreaking, patient-centric framework for healthcare data sharing in the 6G era, integrating FL and blockchain technologies. This integration is poised to revolutionize the healthcare landscape, fostering advancements in research, improving patient care, and ensuring regulatory compliance, all while maintaining a steadfast focus on patient privacy. We offer a comprehensive solution to decentralized data sharing, setting a new standard in healthcare information exchange.
1) Innovative Integration of Technologies: We propose a novel framework combining FL and blockchain for healthcare data sharing. This synergy addresses complex challenges related to data security and efficient sharing, ensuring a patient-centric approach.
2) Privacy-Preserving Data-Sharing Model: Our work introduces a privacy-preserving framework for healthcare data sharing. By amalgamating FL’s capability to train models without exposing raw data and blockchain’s strength in maintaining data integrity, we ensure patient information’s confidentiality and immutability.
3) Empowerment of Patients in Data Sharing: The proposed model enhances patient empowerment by allowing them to maintain control over their healthcare data. Blockchain technology enables patients to participate in medical research while actively preserving data ownership.
4) Healthcare Use Case Application: We demonstrate the practical applicability of our framework through a healthcare use case. This approach leads to more accurate medical models, personalized treatment options, and improved patient care, revolutionizing healthcare.
5) Bridging the Privacy-Utility Gap in Healthcare: Our framework addresses the critical balance between collaborative healthcare research and patient data privacy, aligning with stringent regulatory standards and enhancing transparency and trust in healthcare data sharing.

B. Related Work
Industry 4.0 is characterized by integrating several cutting-edge technologies, such as the Industrial IoT (IIoT); artificial intelligence (AI), including augmented intelligence, big data analytics, ML, and deep learning (DL); and edge–fog cloud computing. These technologies are driving the next phase of digital transformation [28], [29], [30]. However, unlocking the full potential of IIoT requires cross-company collaboration, such as multiparty computation, pooled analyses, data sharing, and data exchange within a network of collaborators or organizations, which is essential to overcome the significant fragmentation of data. Integrating FL, blockchain technology, and healthcare data sharing has been an area of increasing research interest. Numerous studies have examined the technologies individually and in conjunction to address the pressing challenges of healthcare data privacy, security, and collaborative research. Table I summarizes the comparison of existing related work.

FL in Healthcare: FL allows multiple parties to train an ML model collaboratively without sharing raw data. Liu et al. [31] proposed an FL-based approach for decentralized data sharing in the IIoT. The authors showed that their approach achieved better accuracy and reduced communication overhead compared to traditional centralized learning. However, FL still faces challenges, such as the privacy–utility tradeoff and communication efficiency [32]. The combination of homomorphic encryption and FL enables privacy-preserving healthcare data analysis, demonstrating the feasibility of collaborative model training without exposing sensitive patient data [13]. The challenges, methods, and prospects of FL, including its applications in the healthcare domain, are discussed in [33] and [34]. Moreover, FL has been presented as a privacy-preserving paradigm in healthcare, with emphasis on its potential in medical research and the development of diagnostic models [35].

Blockchain in Healthcare: Blockchain is a decentralized and tamper-proof ledger that records transactions and stores data securely and transparently. Blockchain has been proposed as a potential solution for decentralized data sharing due to its ability to provide data immutability, auditability, and transparency. Makhdoom et al. [36] proposed a blockchain-based decentralized data-sharing framework that addressed data privacy and security concerns. Blockchain’s relevance in healthcare has been extensively investigated. Chen et al. [15] examined the patient-centric blockchain model in healthcare, highlighting its capacity for secure and transparent health data management and sharing. Fatima et al. [37] provided a comprehensive review of blockchain’s role in healthcare privacy and data security, focusing on its applications in
TABLE I
COMPARISON OF EXISTING RELATED WORK
II. OVERVIEW
A. Decentralized Data Sharing
Decentralized data sharing refers to distributing data across a network of independent participants rather than relying on a centralized authority to manage and control access to the data. In a decentralized data-sharing system, each participant has a copy of the data and is responsible for maintaining and updating that copy. In addition, participants share data with other participants, either directly or through a P2P network, and access data shared by other participants. Decentralized data sharing is designed with security and privacy in mind to protect against data breaches and unauthorized access to sensitive information. It involves encryption, access controls, and other security measures to safeguard the data [39].

Decentralized data sharing represents a groundbreaking departure from traditional data-sharing approaches, offering many compelling advantages. First, it fortifies data security through its distributed structure, rendering it more resistant to targeted cyber-attacks or data breaches [40], [41]. Unlike centralized systems, where all data resides in a single location vulnerable to hacking [42], decentralized data sharing scatters data across a network of nodes, bolstering protection measures with encryption and access controls. Each node possesses a private key [43], ensuring only intended recipients can access shared data, even if the network is compromised. Furthermore, consensus algorithms verify data accuracy [44], fortifying the security and control over data access. Second, decentralized data sharing empowers individuals with heightened data privacy control. It eliminates the need for a central authority to manage data access, permitting individuals to grant access exclusively to trusted parties. Within this framework, data is distributed across nodes, safeguarded by cryptography. Each entity holds a private key for data encryption and decryption, assuring data privacy and thwarting unauthorized access. This approach significantly augments personal data privacy and control, aligning with contemporary demands for robust privacy measures to enable the requirements of Industry 4.0 toward Industry 5.0 [45]. Fig. 1 illustrates the architecture of decentralized data sharing using blockchain, including components such as smart contracts, blockchain databases, and data governance mechanisms.

Additionally, decentralized data sharing improves interoperability across diverse systems and organizations. It achieves this by embracing open standards and protocols that streamline data sharing among distinct platforms and applications. The result is reduced inefficiencies, redundancies, and delays in data exchange, facilitating seamless collaboration and resource optimization. Moreover, this decentralized approach enhances transparency by allowing all parties to access and validate shared data, cultivating trust and collaborative potential. Decentralized data-sharing systems use encryption to protect the data from unauthorized access or tampering. Each node in the network has a private key used to encrypt and decrypt data,
TABLE II
COMPARING CENTRALIZED AND DECENTRALIZED DATA SHARING
ensuring that only the intended recipient can access the data, which prevents unauthorized access and provides a greater level of security. Decentralized data sharing also improves resilience by creating a distributed network of nodes that continues to operate even if some nodes fail or are compromised and by using encryption to protect the data from unauthorized access or tampering. This reduces the risks associated with data sharing and enables organizations to work more effectively and efficiently [46]. Table II outlines the differences between centralized and decentralized data sharing, emphasizing the superior resilience, privacy, and interoperability of the latter.

B. Blockchain
Blockchain technology is a formidable decentralized and distributed data-sharing solution renowned for its robust security and transparency features. Functioning as a ledger system, it organizes data into immutable and chronological blocks, authenticated through consensus mechanisms among a network of nodes, ensuring its accuracy and timeliness [47]. The successful implementation of a blockchain-based decentralized data-sharing system hinges on several key considerations. It must accommodate substantial data volumes and transactions, necessitating high scalability and performance. Robust security measures, including encryption and tamper-proofing, are vital to data integrity and confidentiality. Additionally, the versatility to support various applications and use cases, spanning financial transactions, supply chain management, and digital identity verification, is paramount [14]. Blockchain’s essential attributes position it as a pivotal player in the evolution of data-sharing systems, such as Dataspace 4.0 and 6G, offering a pathway to highly secure, efficient, and transparent decentralized data-sharing platforms [45], [48].

Blockchain technology’s prowess extends to enhancing interoperability in decentralized data sharing, offering a unified framework for secure and effective interaction among diverse systems and organizations. Blockchain-based systems facilitate secure data exchange while preserving data integrity using common data structures and cryptographic algorithms. Transparency, another hallmark feature of blockchain, ensures all participants maintain a shared, comprehensive view of data and its historical changes. It achieves this through a distributed ledger, creating an immutable, transparent record of all data transactions. This heightened transparency fosters trust among parties, promotes accountability, and ensures compliance. Furthermore, blockchain’s resilience factor is crucial in decentralized data sharing, guaranteeing data availability despite system failures or network disruptions. Advanced consensus mechanisms bolster this resilience, rendering the system less susceptible to malicious attacks or data breaches [49]. Blockchain’s multifaceted potential is evident in various industries, including supply chain management, healthcare, and financial services. It offers secure and transparent data recording and sharing capabilities, enhances efficiency, accountability, and transparency, and presents novel solutions to industry-specific challenges. While blockchain holds immense promise, it is essential to acknowledge and address challenges, such as scalability, energy consumption, and regulatory frameworks, to fully harness its potential for decentralized data sharing across a spectrum of applications [49].

Every node in a decentralized blockchain network has a copy of the ledger. A new transaction is announced to the network whenever one is proposed. The transaction is then independently verified by nodes using pre-established protocols and rules. The primary goal of the consensus process is to reach agreement on the ledger’s current state. This keeps any one node from intentionally or mistakenly changing the blockchain by requiring all nodes to verify and concur on the sequence and legitimacy of transactions. Every node in the network is equal and cooperates to keep the blockchain current. These nodes divide up the transaction processing, including consensus-building and validation. Blockchain’s decentralization guarantees that no single entity controls the network. Rather, agreement is reached among nodes through the consensus process. Because no single organization can dictate changes to the blockchain, security and resilience are enhanced. Blockchain technology’s core feature is the distribution of processing among the network of nodes. It makes the system robust, resilient to attacks, and able to bring together parties that do not trust one another. In summary, the distribution of bulk processing across the blockchain’s nodes highlights the decentralized character of consensus processes. This decentralized processing enhances the blockchain’s security, transparency, and reliability.

C. Federated Learning
FL presents an innovative approach to ML that prioritizes collaborative model training while preserving data privacy and security [50]. In this decentralized paradigm, each participating entity retains its data on its local device or network, eliminating the need to transmit sensitive information to a centralized repository. FL operates by having each participant train an ML model using its local data and send model updates, which reflect the parameter differences after training, to a central server. The server aggregates these updates from all participants, typically through algorithms like averaging
TABLE III
DECENTRALIZED DATA SHARING WHEN BLOCKCHAIN MEETS FL
TABLE IV
ADVANTAGES OF USING FL AND BLOCKCHAIN FOR DECENTRALIZED DATA SHARING
local model training on the user’s device without transferring data to a central server.

Improved Privacy: By using blockchain to store data in an encrypted and distributed manner, users can retain control over their data and decide who can access it. FL can also improve privacy by allowing local model training on user devices without centralized data collection.

Improved Interoperability: Blockchain and FL can enable interoperability between different systems and platforms, allowing seamless data sharing across different networks. FL can also improve interoperability by aggregating locally trained models across different devices and platforms.

Greater Transparency: The use of blockchain can provide greater transparency in data sharing by providing an immutable record of all transactions. The combination can further enhance transparency by enabling users to verify the authenticity of data and model outputs. FL can also improve transparency by allowing for the inspection of locally trained models by independent auditors.

Improved Resilience: The combination of blockchain and FL can ensure that data and models are distributed across a decentralized network of nodes, making the system more resilient to failures and attacks. FL can also improve resilience by allowing local model training on user devices, reducing the reliance on centralized servers. Using blockchain with FL can increase security and privacy while ensuring a transparent and fair training process. However, it may also require additional computational resources and coordination between parties and may not always be necessary or practical, depending on the specific use case. Table IV compares FL with and without blockchain for decentralized data sharing.

Fig. 3 shows the combination of FL and blockchain for decentralized data sharing. The data sources component represents the data sources that can be used in Dataspace 4.0, which can include sensors, devices, databases, and other sources. The FL component represents the ML algorithms used for training models on distributed data; FL allows models to be trained without the need for centralized data storage. Data labeling and model training represent the processes of labeling data and training ML models on the labeled data, which can be done in a decentralized manner using FL. Blockchain consensus and validation of transactions represent the use of blockchain for consensus and validation of transactions in Dataspace 4.0. Blockchain provides a decentralized mechanism for validating and verifying data transactions.
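
To make the combination of FL and a ledger more concrete, the following minimal Python sketch has each participant's model update fingerprinted and appended to a hash-chained ledger, after which an aggregator averages only the updates whose digests match the ledger. This is an illustration under stated assumptions rather than the implementation used in this article: the function names (`append_record`, `ledger_is_valid`, `federated_average`), the record fields, and the toy weight vectors are all hypothetical, the ledger is a plain in-memory list without consensus or proof of work, and sample-weighted averaging stands in for whichever aggregation rule a deployment actually uses.

```python
import hashlib
import json


def sha256_hex(payload: dict) -> str:
    """Deterministic SHA-256 digest of a JSON-serialisable dict."""
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()


def append_record(ledger: list, node_id: str, weights: list, n_samples: int) -> dict:
    """Append a record of a model update, linked to the previous record by hash."""
    prev_hash = ledger[-1]["hash"] if ledger else "0" * 64
    body = {
        "node_id": node_id,
        "update_digest": sha256_hex({"weights": weights}),  # fingerprint only, no raw data
        "n_samples": n_samples,
        "prev_hash": prev_hash,
    }
    record = dict(body, hash=sha256_hex(body))
    ledger.append(record)
    return record


def ledger_is_valid(ledger: list) -> bool:
    """Recompute every hash and check that each record links to its predecessor."""
    prev_hash = "0" * 64
    for record in ledger:
        body = {k: v for k, v in record.items() if k != "hash"}
        if record["prev_hash"] != prev_hash or record["hash"] != sha256_hex(body):
            return False
        prev_hash = record["hash"]
    return True


def federated_average(updates: dict, ledger: list) -> list:
    """Sample-weighted average of the updates whose digests match the ledger."""
    if not ledger or not ledger_is_valid(ledger):
        raise ValueError("ledger is empty or failed hash-chain verification")
    total = 0
    acc = None
    for record in ledger:
        weights = updates[record["node_id"]]
        if sha256_hex({"weights": weights}) != record["update_digest"]:
            raise ValueError("update from " + record["node_id"] + " does not match the ledger")
        if acc is None:
            acc = [0.0] * len(weights)
        n = record["n_samples"]
        acc = [a + n * w for a, w in zip(acc, weights)]
        total += n
    return [a / total for a in acc]


# Toy example: two hypothetical hospitals with two-parameter local models.
updates = {"hospital_a": [0.2, 0.4], "hospital_b": [0.6, 0.8]}
ledger = []
append_record(ledger, "hospital_a", updates["hospital_a"], n_samples=100)
append_record(ledger, "hospital_b", updates["hospital_b"], n_samples=300)
global_weights = federated_average(updates, ledger)
print(global_weights)
```

Only digests and sample counts are written to the ledger in this sketch, so the raw data and the weights themselves stay with the participants and the aggregator; changing any field of an earlier record invalidates its stored hash, which `ledger_is_valid` detects.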
TABLE V
BENEFITS OF DECENTRALIZED DATA SHARING IN DIFFERENT INDUSTRIES WITHIN THE CONTEXT OF INDUSTRY 4.0
Fig. 6. Transactions between hospitals.

research. This data set consists of 150 samples from three species of Iris flowers (Iris setosa, Iris virginica, and Iris versicolor). Four features were measured from each sample: the lengths and the widths of the sepals and petals. Given its rich history in data analysis and ML, the Iris data set served as an ideal foundation for demonstrating the feasibility and effectiveness of our decentralized data-sharing mechanism.
1) Data Representation in Federated Learning: FL ensures that the participating nodes, like hospitals, retain their local data without exposing the raw data set to others. However, essential attributes or insights derived from the data might undergo encryption and be shared for collaborative learning. These shared attributes, rather than the actual data, get recorded on the blockchain, ensuring transparency, security, and consistency.
2) Data Representation in Blockchain: While the actual data sets, like the Iris data set, do not leave the respective hospitals, specific data attributes are processed and then encrypted for sharing on the blockchain. Specifically, the attributes of the Iris data set (sepal length, sepal width, petal length, and petal width) are encrypted using the recipient’s public key. Additionally, the species label acts as metadata, which is not encrypted, allowing for querying based on species without requiring decryption. The complexity in these sections is centered around data attribute encryption and decryption. The encryption process used for the Iris data set attributes, like sepal length and width, is based on public-key cryptography. The time complexity for such operations typically depends on the key size and the algorithm used, often being polynomial in the key length

\[ T = \{\mathrm{encrypt}(s_l, pk),\ \mathrm{encrypt}(s_w, pk),\ \mathrm{encrypt}(p_l, pk),\ \mathrm{encrypt}(p_w, pk),\ \mathit{species}\} \tag{1} \]

where:
1) $s_l$ is the sepal length;
2) $s_w$ is the sepal width;
3) $p_l$ is the petal length;
4) $p_w$ is the petal width.
3) Data Retrieval and Analysis: To retrieve specific data attributes from the blockchain, we implement Algorithm 1. The retrieval algorithm’s complexity depends on the filtered data’s size and the decryption process’s efficiency. If $n$ represents the number of transactions and $d$ represents the decryption time, the total time complexity would be $O(n \cdot d)$, assuming the filter operation’s complexity is less than or equal to $O(n)$. After retrieving the data, standard data analysis or ML techniques can be applied to the decrypted data set.
4) Data Structure (Blockchain): Each hospital’s blockchain can be represented as a sequence of blocks

\[ B = \{b_0, b_1, b_2, \ldots, b_n\} \tag{2} \]

where $b_0$ is the genesis block and $b_n$ is the latest block. Each block $b_i$ contains

\[ b_i = \{T, h(b_{i-1}), \mathit{nonce}\} \tag{3} \]

where:
1) $T$ is the transaction data;
2) $h(b_{i-1})$ is a cryptographic hash of the previous block;
3) nonce is a variable adjusted during the proof-of-work process.
The blockchain structure comprises a sequence of blocks, each linking to its predecessor through a hash. The complexity of adding a new block involves calculating the hash and performing the proof of work, which has a complexity of $O(2^k)$ on average, where $k$ is the number of bits required by the difficulty target $D$.

C. Data Transaction
Given a message $M$, the encrypted message $E$ for a recipient with public key $pk$ is

\[ E = \mathrm{encrypt}(M, pk). \tag{4} \]

The signature $S$ using the sender’s private key $sk$ is

\[ S = \mathrm{sign}(M, sk). \tag{5} \]

The data transaction process involves encryption and signing operations. Both operations run in time polynomial in the key sizes used for encryption and signing. The transmission complexity depends on network factors and is typically considered $O(1)$ in the context of algorithmic analysis. Algorithm 2 outlines the process for sending encrypted data and the corresponding digital signature in a blockchain-based data transaction.

D. Consensus Mechanism: Proof of Work
The proof-of-work consensus mechanism aims to find a nonce such that

\[ h(T, h(b_{i-1}), \mathit{nonce}) < D \tag{6} \]

where $T$, $h(b_{i-1})$, and nonce are as defined for (3), and $D$ is the difficulty target.
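As a hedged illustration of how (1)–(6) fit together, the following Python sketch builds encrypted and signed transactions, mines them into hash-linked blocks, and retrieves rows by species. It is a minimal sketch under stated assumptions rather than a reproduction of Algorithm 1 or Algorithm 2: textbook RSA over small Mersenne primes stands in for the unspecified public-key scheme and offers no real security, SHA-256 is assumed for $h$, the difficulty of $k = 12$ bits is arbitrary, and all function names, key pairs, and the three sample rows are hypothetical choices made for illustration.

```python
"""Toy end-to-end sketch of Eqs. (1)-(6). Illustrative only, NOT secure.

Assumptions: textbook RSA with small Mersenne primes stands in for the
unspecified public-key scheme, SHA-256 for h, and k = 12 difficulty bits.
"""
import hashlib
import json

E = 65537                       # common RSA public exponent
K_BITS = 12                     # k: difficulty bits (small so mining is fast)
TARGET = 1 << (256 - K_BITS)    # D in Eq. (6)


def make_keypair(p, q):
    """Return (pk, sk) for textbook RSA built from primes p and q."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))   # modular inverse, Python 3.8+
    return (E, n), (d, n)


def encrypt(m, pk):
    """Eq. (4): E = encrypt(M, pk) for an integer message m < n."""
    return pow(m, pk[0], pk[1])


def decrypt(c, sk):
    return pow(c, sk[0], sk[1])


def _digest(data, n):
    return int(hashlib.sha256(data.encode()).hexdigest(), 16) % n


def sign(message, sk):
    """Eq. (5): S = sign(M, sk), here RSA applied to a SHA-256 digest."""
    return pow(_digest(message, sk[1]), sk[0], sk[1])


def verify(message, signature, pk):
    return pow(signature, pk[0], pk[1]) == _digest(message, pk[1])


def make_transaction(sample, species, recipient_pk, sender_sk):
    """Eq. (1): encrypted measurements plus plaintext species metadata."""
    m = [round(v * 10) for v in sample]          # M: cm converted to mm
    return {"enc": [encrypt(v, recipient_pk) for v in m],
            "species": species,
            "sig": sign(json.dumps(m), sender_sk)}


def block_hash(block):
    """h(T, h(b_{i-1}), nonce) as a 256-bit integer."""
    payload = json.dumps([block["T"], block["prev"], block["nonce"]])
    return int(hashlib.sha256(payload.encode()).hexdigest(), 16)


def mine(transaction, prev_hash):
    """Eq. (6): try nonces until the block hash falls below TARGET."""
    block = {"T": transaction, "prev": prev_hash, "nonce": 0}
    while block_hash(block) >= TARGET:
        block["nonce"] += 1
    return block


def is_valid(chain):
    """Eqs. (2)-(3): every block meets D and links to its predecessor."""
    return all(
        block_hash(b) < TARGET
        and (i == 0 or b["prev"] == block_hash(chain[i - 1]))
        for i, b in enumerate(chain)
    )


def retrieve(chain, species, recipient_sk, sender_pk):
    """Filter on plaintext species, then decrypt and check each signature."""
    rows = []
    for tx in (b["T"] for b in chain):
        if tx["species"] != species:
            continue
        m = [decrypt(c, recipient_sk) for c in tx["enc"]]
        if verify(json.dumps(m), tx["sig"], sender_pk):
            rows.append([v / 10 for v in m])
    return rows


# Hypothetical sender (hospital A) and recipient (hospital B) key pairs.
pk_a, sk_a = make_keypair(2**31 - 1, 2**61 - 1)
pk_b, sk_b = make_keypair(2**17 - 1, 2**19 - 1)

samples = [((5.1, 3.5, 1.4, 0.2), "setosa"),
           ((7.0, 3.2, 4.7, 1.4), "versicolor"),
           ((4.9, 3.0, 1.4, 0.2), "setosa")]
chain, prev = [], 0
for sample, species in samples:
    chain.append(mine(make_transaction(sample, species, pk_b, sk_a), prev))
    prev = block_hash(chain[-1])

setosa_rows = retrieve(chain, "setosa", sk_b, pk_a)
```

Here `retrieve` scans all $n$ transactions but decrypts only those whose plaintext species label matches, in line with the $O(n \cdot d)$ bound stated above. Because textbook RSA is deterministic, equal measurements yield equal ciphertexts, so a practical system would need a padded scheme such as RSA-OAEP and a vetted cryptographic library.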
19612 IEEE INTERNET OF THINGS JOURNAL, VOL. 11, NO. 11, 1 JUNE 2024
the competitiveness of companies. Second, decentralized data sharing promotes collaboration among participants, allowing them to work together to solve complex problems and develop new solutions. This can lead to new business models, partnerships, and ecosystems. Third, decentralized data sharing can facilitate the development of new technologies and applications, such as blockchain and edge computing, which can further enhance the capabilities of Dataspace 4.0. Fourth, it can lead to increased transparency and accountability, which is particularly important in healthcare and finance, where privacy and security are crucial. Finally, decentralized data sharing can give individuals more control over their data, increasing privacy and security. This can lead to the development of new services that provide individuals with more control over their personal information.
Applying BCFL to decentralized data sharing presents a unique and promising use case in healthcare, particularly for remote monitoring applications.
1) Remote Patient Monitoring (RPM): RPM involves tracking patient health data outside of traditional clinical settings. This could include monitoring vital signs, blood sugar levels, heart rate, or other relevant health metrics through wearable devices or home-based equipment [59].
2) Collaborative Research and Treatment Optimization: BCFL can facilitate collaborative research among different healthcare entities while maintaining data privacy. This collaboration can lead to more comprehensive health models, benefiting treatment optimization [60].
3) Regulatory Compliance and Consent Management: Healthcare is a highly regulated sector, and BCFL can aid in complying with regulations like HIPAA, GDPR, and others concerning patient data protection [61].

C. Future Directions
Advancing decentralized data sharing requires multifaceted research efforts. Technical challenges, including data integration, interoperability, and security, demand the development of tailored algorithms and architectures. Legal and regulatory dimensions necessitate the exploration of frameworks safeguarding privacy amid data sharing. Investigating the potential of decentralized data sharing in industries like healthcare and finance involves identifying domain-specific use cases. Additionally, emerging technologies, such as blockchain and edge computing, require scrutiny for their performance in decentralized contexts. Finally, developing business models and ecosystems with incentives for collaboration is vital. Looking ahead, a focus on practical applications, exemplified through case studies in healthcare partnerships, aims to validate methodologies, address concerns about centralized control, and enhance flexibility for global applicability. The commitment to refining and verifying these approaches in real-world healthcare underscores a dedicated thrust for the evolution of decentralized data sharing.

VI. CONCLUSION
This article has introduced a groundbreaking exploration of the conceptual framework and technical synergy between FL and blockchain, signalling a paradigm shift toward secure, collaborative, and patient-centric decentralized data sharing in the data-driven healthcare era. The combination of FL’s decentralized ML paradigm and blockchain’s transparent and immutable ledger creates an ecosystem fostering trust, security, and data integrity. While a specific real-world healthcare use case is not presented, this article vividly outlines the potential impact of this fusion on patient care, emphasizing the preservation of patient privacy alongside granting healthcare providers and researchers access to diverse data sets. The proposed approach promises to accelerate medical research, improve treatment outcomes, and empower patients through data ownership. The synergy of FL and blockchain envisions a healthcare ecosystem that prioritizes individual privacy, fosters advancements in medical science, and sets the stage for a transformative shift in healthcare data sharing. This innovative approach addresses the challenges of balancing data utility and privacy and opens avenues for more accurate models, leading to enhanced diagnoses and ultimately contributing to the evolution of a patient-centric and collaborative healthcare landscape.

REFERENCES
[1] “A common data space 4.0 for European manufacturing.” DFA. Accessed: Oct. 16, 2023. [Online]. Available: https://digitalfactoryalliance.eu/moving-towards-a-common-data-space-4-0-for-european-manufacturing/
[2] “Dataspace 4.0.” Accessed: Jun. 23, 2023. [Online]. Available: https://digitalfactoryalliance.eu/data-space-4-0-alliance/
[3] P. Varga et al., “Converging telco-grade solutions 5G and beyond to support production in industry 4.0,” Appl. Sci., vol. 12, no. 15, p. 7600, 2022.
[4] B. Han et al., “Digital twins for industry 4.0 in the 6G era,” 2022, arXiv:2210.08970.
[5] T. White, E. Blok, and V. D. Calhoun, “Data sharing and privacy issues in neuroimaging research: Opportunities, obstacles, challenges, and monsters under the bed,” Hum. Brain Map., vol. 43, no. 1, pp. 278–291, 2022.
[6] A. Torab-Miandoab, T. Samad-Soltani, A. Jodati, and P. Rezaei-Hachesu, “Interoperability of heterogeneous health information systems: A systematic literature review,” BMC Med. Informat. Decis. Mak., vol. 23, no. 1, p. 18, 2023.
[7] M. A. Uddin, A. Stranieri, and I. Gondal, “A survey on the adoption of blockchain in IoT: Challenges and solutions,” Blockchain, Res. Appl., vol. 2, no. 2, 2021, Art. no. 100006.
[8] V. Neumann et al., “Examining public views on decentralised health data sharing,” PLoS ONE, vol. 18, no. 3, 2023, Art. no. e0282257.
[9] E. Curry et al., “Data sharing spaces: The BDVA perspective,” in Designing Data Spaces: The Ecosystem Approach to Competitive Advantage. Cham, Switzerland: Springer Int. Publ., 2022, pp. 365–382.
[10] V. Pandi Chellapandi, A. Upadhyay, A. Hashemi, and S. H. Zak, “On the convergence of decentralized federated learning under imperfect information sharing,” 2023, arXiv:2303.10695.
[11] H. Niavis, N. Papadis, V. Reddy, H. Rao, and L. Tassiulas, “A blockchain-based decentralized data sharing infrastructure for off-grid networking,” in Proc. IEEE Int. Conf. Blockchain Cryptocurr. (ICBC), 2020, pp. 1–5.
[12] B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, “Communication-efficient learning of deep networks from decentralized data,” in Proc. 20th Artif. Intell. Statist., 2017, pp. 1273–1282.
[13] H. Fang and Q. Qian, “Privacy preserving machine learning with homomorphic encryption and federated learning,” Future Internet, vol. 13, no. 4, p. 94, 2021.
[14] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,” in Proc. Decent. Bus. Rev., 2008, p. 21260.
[15] H. S. Chen, J. T. Jarrell, K. A. Carpenter, D. S. Cohen, and X. Huang, “Blockchain in healthcare: A patient-centered model,” Biomed. J. Sci. Techn. Res., vol. 20, no. 3, 2019, Art. no. 15017.
[16] S. Alansari, “A blockchain-based approach for secure, transparent and accountable personal data sharing,” Ph.D. dissertation, Faculty of Eng., Sci. Math., University of Southampton, Southampton, U.K., 2020.
[17] S. H. Alsamhi et al., “Drones’ edge intelligence over smart environments in B5G: Blockchain and federated learning synergy,” IEEE Trans. Green Commun. Netw., vol. 6, no. 1, pp. 295–312, Mar. 2022.
[18] I. Jao et al., “Research stakeholders’ views on benefits and challenges for public health research data sharing in Kenya: The importance of trust and social relations,” PLoS ONE, vol. 10, no. 9, 2015, Art. no. e0135545.
[19] Y. M. Arif, H. Nurhayati, F. Kurniawan, S. M. S. Nugroho, and M. Hariadi, “Blockchain-based data sharing for decentralized tourism destinations recommendation system,” Int. J. Intell. Eng. Syst., vol. 13, no. 6, pp. 472–486, 2020.
[20] T. Guggenberger, A. Schweizer, and N. Urbach, “Improving interorganizational information sharing for vendor managed inventory: Toward a decentralized information hub using blockchain technology,” IEEE Trans. Eng. Manag., vol. 67, no. 4, pp. 1074–1085, Nov. 2020.
[21] M. Johnson, M. Jones, M. Shervey, J. T. Dudley, and N. Zimmerman, “Building a secure biomedical data sharing decentralized app (DApp): Tutorial,” J. Med. Internet Res., vol. 21, no. 10, 2019, Art. no. e13601.
[22] A. Balador, A. Bazzi, U. Hernandez-Jayo, I. de la Iglesia, and H. Ahmadvand, “A survey on vehicular communication for cooperative truck platooning application,” Veh. Commun., vol. 35, 2022, Art. no. 100460.
[23] M. Firdaus, S. Rahmadika, and K. H. Rhee, “Decentralized trusted data sharing management on internet of vehicle edge computing (IoVEC) networks using consortium blockchain,” Sensors, vol. 21, no. 7, p. 2410, 2021.
[24] P. Wang, W. Cui, and J. Li, “A framework of data sharing system with decentralized network,” in Proc. 1st Int. Conf. (BigSDM), 2019, pp. 255–262.
[25] O. Gallay, K. Korpela, N. Tapio, and J. K. Nurminen, “A peer-to-peer platform for decentralized logistics,” in Proc. Hamburg Int. Conf. Logist. (HICL), 2017, pp. 19–34.
[26] S. Swetha and P. M. JoePrathap, “A study on a decentralized network secured data sharing using blockchain,” in Proc. 1st Int. Conf. Comput. Sci. Technol. (ICCST), 2022, pp. 620–624.
[27] C. F. L. Hickman et al., “Data sharing: Using blockchain and decentralized data technologies to unlock the potential of artificial intelligence: What can assisted reproduction learn from other areas of medicine?” Fertil. Steril., vol. 114, no. 5, pp. 927–933, 2020.
[28] F. Firouzi, B. Farahani, and A. Marinšek, “The convergence and interplay of edge, fog, and cloud in the AI-driven Internet of Things (IoT),” Inf. Syst., vol. 107, Jul. 2022, Art. no. 101840.
[29] F. Firouzi et al., “Fusion of IoT, AI, edge–fog–cloud, and blockchain: Challenges, solutions, and a case study in healthcare and medicine,” IEEE Internet Things J., vol. 10, no. 5, pp. 3686–3705, Mar. 2023.
[30] B. Farahani, F. Firouzi, and M. Luecking, “The convergence of IoT and distributed ledger technologies (DLT): Opportunities, challenges, and solutions,” J. Netw. Comput. Appl., vol. 177, 2021, Art. no. 102936.
[31] Y. Liu et al., “Deep anomaly detection for time-series data in Industrial IoT: A communication-efficient on-device federated learning approach,” IEEE Internet Things J., vol. 8, no. 8, pp. 6348–6358, Apr. 2021.
[32] Q. Meng, F. Zhou, H. Ren, T. Feng, G. Liu, and Y. Lin, “Improving federated learning face recognition via privacy-agnostic clusters,” 2022, arXiv:2201.12467.
[33] T. Li, A. Kumar Sahu, A. Talwalkar, and V. Smith, “Federated learning: Challenges, methods, and future directions,” IEEE Signal Process. Mag., vol. 37, no. 3, pp. 50–60, May 2020.
[34] S. H. Alsamhi, A. V. Shvetsov, A. Hawbani, S. V. Shvetsova, S. Kumar, and L. Zhao, “Survey on federated learning enabling indoor navigation for industry 4.0 in B5G,” Future Gener. Comput. Syst., vol. 148, pp. 250–265, Nov. 2023.
[35] Y. Tian, S. Wang, J. Xiong, R. Bi, Z. Zhou, and M. Z. A. Bhuiyan, “Robust and privacy-preserving decentralized deep federated learning training: Focusing on digital healthcare applications,” IEEE/ACM Trans. Comput. Biol. Bioinformat., early access, Mar. 3, 2023, doi: 10.1109/TCBB.2023.3243932.
[36] I. Makhdoom, I. Zhou, M. Abolhasan, J. Lipman, and W. Ni, “PrivySharing: A blockchain-based framework for privacy-preserving and secure data sharing in smart cities,” Comput. Secur., vol. 88, Jan. 2020, Art. no. 101653.
[37] N. Fatima, P. Agarwal, and S. S. Sohail, “Security and privacy issues of blockchain technology in health care: A review,” in ICT Analysis and Applications. Singapore: Springer, 2022, pp. 193–201.
[38] H. Fan et al., “Privacy preserving ultra-short-term wind power prediction based on secure multi party computation,” 2023, arXiv:2301.13513.
[39] Y. Ye, L. Zhang, W. You, and Y. Mu, “Secure decentralized access control policy for data sharing in smart grid,” in Proc. IEEE Conf. Comput. Commun. Workshops (INFOCOM WKSHPS), 2021, pp. 1–6.
[40] M. Sultana, A. Hossain, F. Laila, K. A. Taher, and M. N. Islam, “Towards developing a secure medical image sharing system based on zero trust principles and blockchain technology,” BMC Med. Informat. Decis. Mak., vol. 20, no. 1, pp. 1–10, 2020.
[41] Y. Lu, X. Huang, Y. Dai, S. Maharjan, and Y. Zhang, “Blockchain and federated learning for privacy-preserved data sharing in Industrial IoT,” IEEE Trans. Ind. Informat., vol. 16, no. 6, pp. 4177–4186, Jun. 2020.
[42] M. Stolpe, “The Internet of Things: Opportunities and challenges for distributed data analysis,” ACM SIGKDD Explor. Newslett., vol. 18, no. 1, pp. 15–34, 2016.
[43] R. E. Endeley, “End-to-end encryption in messaging services and national security: Case of Whatsapp messenger,” J. Inf. Secur., vol. 9, no. 1, p. 95, 2018.
[44] B. Lashkari and P. Musilek, “A comprehensive review of blockchain consensus mechanisms,” IEEE Access, vol. 9, pp. 43620–43652, 2021.
[45] M. Pilkington, “11 blockchain technology: Principles and applications,” in Research Handbook on Digital Transformations, vol. 225. Cheltenham, U.K.: Edward Elgar Publ., 2016.
[46] E.-H. Diallo, O. Dib, and K. Al Agha, “A scalable blockchain-based scheme for traffic-related data sharing in VANETs,” Blockchain, Res. Appl., vol. 3, no. 3, 2022, Art. no. 100087.
[47] M. Pilkington, “Blockchain technology: Principles and applications,” in Research Handbook on Digital Transformations. Cheltenham, U.K.: Edward Elgar Publ., 2016, pp. 225–253.
[48] H. Xu, P. V. Klaine, O. Onireti, B. Cao, M. Imran, and L. Zhang, “Blockchain-enabled resource management and sharing for 6G communications,” Digit. Commun. Netw., vol. 6, no. 3, pp. 261–269, 2020.
[49] N. Radziwill, “Blockchain revolution: How the technology behind bitcoin is changing money, business, and the world,” Qual. Manag. J., vol. 25, no. 1, pp. 64–65, 2018.
[50] J. Konečný, H. B. McMahan, F. X. Yu, P. Richtárik, A. Theertha Suresh, and D. Bacon, “Federated learning: Strategies for improving communication efficiency,” 2016, arXiv:1610.05492.
[51] K. Bonawitz et al., “Towards federated learning at scale: System design,” in Proc. Mach. Learn. Syst., vol. 1, 2019, pp. 374–388.
[52] T. Li, A. K. Sahu, M. Zaheer, M. Sanjabi, A. Talwalkar, and V. Smith, “Federated optimization in heterogeneous networks,” in Proc. Mach. Learn. Syst., vol. 2, 2020, pp. 429–450.
[53] N. Rieke et al., “The future of digital health with federated learning,” NPJ Digit. Med., vol. 3, no. 1, p. 119, 2020.
[54] Q. Yang, Y. Liu, T. Chen, and Y. Tong, “Federated machine learning: Concept and applications,” ACM Trans. Intell. Syst. Technol., vol. 10, no. 2, pp. 1–19, 2019.
[55] T. Nishio and R. Yonetani, “Client selection for federated learning with heterogeneous resources in mobile edge,” in Proc. IEEE Int. Conf. Commun. (ICC), 2019, pp. 1–7.
[56] R. Myrzashova, S. H. Alsamhi, A. V. Shvetsov, A. Hawbani, and X. Wei, “Blockchain meets federated learning in healthcare: A systematic review with challenges and opportunities,” IEEE Internet Things J., vol. 10, no. 16, pp. 14418–14437, Aug. 2023.
[57] Z. Zhou, C. Guo, X. Zhang, R. Wang, L. Zhang, and M. Imran, “A blockchain-based data sharing marketplace with a federated learning use case,” in Proc. IEEE Int. Conf. Blockchain Cryptocurr. (ICBC), Dubai, UAE, 2023, pp. 1041–1044.
[58] J. Bang and M.-J. Choi, “Design of personal data protection decentralized model using blockchain and IPFS,” in Proc. 24th Asia–Pac. Netw. Oper. Manag. Symp. (APNOMS), Sejong, South Korea, 2023, pp. 251–254.
[59] D. K. Acharya, M. Shrivastava, and P. Padhi, “A decentralized blockchain-based IoT system for privacy-preserving data sharing,” in Proc. IEEE Int. Conf. Blockchain Distrib. Syst. Secur. (ICBDS), New Raipur, India, 2023, pp. 1–5.
[60] R. Song, B. Xiao, Y. Song, S. Guo, and Y. Yang, “A survey of blockchain-based schemes for data sharing and exchange,” IEEE Trans. Big Data, vol. 9, no. 6, pp. 1477–1495, Dec. 2023.
[61] Y. Liu, P. Liu, W. Jing, and H. H. Song, “PD2S: A privacy-preserving differentiated data sharing scheme based on blockchain and federated learning,” IEEE Internet Things J., vol. 10, no. 24, pp. 21489–21501, Dec. 2023.