
Journal of Systems Architecture 148 (2024) 103087

Contents lists available at ScienceDirect

Journal of Systems Architecture


journal homepage: www.elsevier.com/locate/sysarc

Doppel: A BFT consensus algorithm for cyber-physical systems with low latency
Rui Hao a, Xiaohai Dai b, Xia Xie c,∗
a School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan, 430070, China
b Services Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
c School of Computer Science and Technology, Hainan University, Haikou, 570228, China

ARTICLE INFO

Keywords:
Byzantine fault tolerance
Byzantine generals
Consensus algorithm
Blockchain
Cyber physical systems

ABSTRACT

The integration of blockchain technology with Cyber-Physical Systems (CPS) has gained significant attention across various industry domains such as manufacturing, healthcare, transportation, and energy management. The consensus mechanism serves as the fundamental component of decentralized blockchain systems, and the efficiency of the consensus algorithm greatly impacts the practicality of the entire system. One of the most well-known representative algorithms in the field of Byzantine Fault Tolerance (BFT) consensus is PBFT, which operates through successive views where a replica is designated as the leader in each view. The leader assumes the responsibility of proposing new requests and gathering votes from other replicas. However, in cases where the leader becomes Byzantine or faulty, a view-change process is triggered after a 𝛥 timer expires, transitioning to the next view. Requests that failed to commit in a previous view may experience a delay of up to 𝛥 + 6𝛿 (𝛿 representing the actual network delay) before being committed in the new view. Given that 𝛥 is typically set to a large value, this latency becomes unacceptable for CPS, which necessitates real-time transaction processing or data exchange. Following PBFT, several works, including SBFT, have aimed to improve consensus performance. However, all these works primarily focus on reducing latency in the normal case and overlook the aforementioned issue that exists during the view-change process.

To address this issue, we propose Doppel, an approach that reduces the latency caused by a faulty leader. Doppel achieves this by introducing two leaders, namely 𝐿₁ and 𝐿₂, in each view and enabling request commitment during the view-change process. These leaders are assigned different priorities. If 𝐿₁ is non-faulty, Doppel can commit requests within 5𝛿, which is equivalent to PBFT and SBFT. Conversely, if 𝐿₁ is faulty but 𝐿₂ is non-faulty, Doppel commits requests within 7𝛿, which is notably smaller than 𝛥 + 6𝛿 in PBFT and SBFT. We have implemented a system prototype of Doppel and conducted multiple experiments to evaluate its performance. The experimental results consistently demonstrate that Doppel outperforms PBFT and SBFT.

1. Introduction

A Cyber-Physical System (CPS) is a complex and multi-dimensional system that integrates computing, networking, and physical environments. CPS plays a crucial and foundational role in Industry 4.0 [1], with applications across a variety of industrial domains, including manufacturing [2–5], healthcare [6,7], transportation [8–10], and energy management [11,12]. However, CPS's collaboration with diverse entities often creates security challenges in traditional centralized structures, such as data breaches and cyber threats. Blockchain's attributes like decentralization, traceability, and distributed consensus offer solutions for identity trust, behavior traceability, and secure data transmission, making it increasingly adopted in both industrial and research contexts for enhancing CPS security and efficiency [2,6,11,13].

The blockchain is a digital ledger proposed by the pseudonymous Satoshi Nakamoto in 2008 [14]. It is shared, immutable, and decentralized, and it addresses data trust issues by integrating several technologies, including peer-to-peer protocols, asymmetric encryption, consensus mechanisms, and blockchain structures. The consensus mechanism plays a crucial role in blockchain systems, as it resolves trust issues in distributed systems while ensuring data consistency and security across nodes. In recent years, Byzantine Fault Tolerance (BFT) consensus protocols have gained significant attention [15,16] due to their ability to maintain identical ledgers among mutually distrustful replicas [17,18]. These protocols are particularly suitable for CPS with high-security requirements.

∗ Corresponding author.
E-mail addresses: [email protected] (R. Hao), [email protected] (X. Dai), [email protected] (X. Xie).

https://doi.org/10.1016/j.sysarc.2024.103087
Received 29 June 2023; Received in revised form 7 February 2024; Accepted 7 February 2024
Available online 9 February 2024
1383-7621/© 2024 Elsevier B.V. All rights reserved.

Table 1
Consensus performance comparison. In PBFT-like protocols, 𝐿 and 𝐿′ represent the leaders of the current and next view, respectively. In Doppel, 𝐿 and 𝐿′ represent the two leaders of the current view. 𝛿 represents the actual network delay, while 𝛥 denotes the timeout parameter.

| Protocol | Latency: correct 𝛥, non-faulty 𝐿 | Latency: correct 𝛥, faulty 𝐿 & non-faulty 𝐿′ | Latency: incorrect 𝛥 (a) | Message complexity: correct 𝛥, non-faulty 𝐿 | Message complexity: correct 𝛥, faulty 𝐿 & non-faulty 𝐿′ | Message complexity: incorrect 𝛥 |
| PBFT-like protocols [21,22] | 5𝛿 | 𝛥 + 6𝛿 | / | 𝑂(𝑛) | 𝑂(𝑛²) | 𝑂(𝑛²) |
| Doppel (this paper) | 5𝛿 | 7𝛿 | / | 𝑂(𝑛) | 𝑂(𝑛²) | 𝑂(𝑛²) |

(a) We omit the latency measurement for the situation where 𝛥 is incorrectly set, since the time at which consensus can be reached is uncertain.
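
As a rough illustration of the latency entries in Table 1, the following Python sketch evaluates them for concrete numbers. The function names and the sample values of 𝛿 and 𝛥 are our own illustrative assumptions; only the formulas 5𝛿, 𝛥 + 6𝛿, and 7𝛿 come from the table.

```python
def pbft_like_latency(delta, timeout, leader_faulty):
    """Commit latency in PBFT-like protocols when the timeout is set correctly."""
    if not leader_faulty:
        return 5 * delta
    # timeout to trigger the view change, then 6 network delays in the new view
    return timeout + 6 * delta


def doppel_latency(delta, l1_faulty):
    """Commit latency in Doppel, assuming the second leader is non-faulty."""
    return 7 * delta if l1_faulty else 5 * delta


if __name__ == "__main__":
    # Hypothetical values in milliseconds, chosen only for illustration.
    delta_ms, timeout_ms = 50, 2000
    print("PBFT-like, non-faulty leader:", pbft_like_latency(delta_ms, timeout_ms, False))
    print("PBFT-like, faulty leader:    ", pbft_like_latency(delta_ms, timeout_ms, True))
    print("Doppel, non-faulty L1:       ", doppel_latency(delta_ms, False))
    print("Doppel, faulty L1:           ", doppel_latency(delta_ms, True))
```

Because 𝛥 does not appear in the Doppel formula, the gap between the two faulty-leader cases grows with the chosen timeout.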

When contemplating the integration of blockchain technology in CPS, there are three crucial factors that must be carefully considered. Firstly, the inherent openness of both network and physical systems exposes the CPS to a spectrum of challenges, including cyber–physical attacks, malware injection, denial-of-service attacks, and other malicious threats [19,20]. Secondly, the communication network within CPS, especially in wireless environments, is marked by instability, which presents issues related to reliability, latency, and bandwidth constraints. Finally, many CPS applications demand a timely system response, which necessitates low-latency performance, especially in quick decision-making scenarios such as autonomous vehicles, healthcare monitoring, and industrial automation. Consequently, when employing blockchain technology in CPS, it is essential to prioritize security, stability, and low latency.

Nevertheless, Practical Byzantine Fault Tolerance (PBFT) [21], the prominent consensus algorithm in blockchain, faces significant challenges associated with high latency when deployed in CPS under insecure or unstable communication networks. Specifically, elevated latency arises when the leader is attacked and becomes Byzantine, or when messages are delayed due to volatile wireless network environments. These challenges are prevalent in CPS and can substantially degrade system performance, constraining the applicability of PBFT in CPS that demand higher real-time performance.

In PBFT, the consensus process progresses through successive views, with one replica designated as the leader for each view. The leader proposes new requests by broadcasting them and also generates and broadcasts quorum certificate (QC) data. Under normal operation, requests can be committed after five communication rounds, resulting in a consensus latency of 5𝛿 (where 𝛿 represents the actual network delay). To handle scenarios where the leader becomes Byzantine or faulty, PBFT incorporates a timeout mechanism. At the start of each view, a timer with a parameter 𝛥 is initiated and reset upon reaching an agreement on a request. If the timer expires, the view switches to the next using the view-change protocol, and a new leader is selected. The view-change protocol involves two communication rounds, during which the new leader decides and proposes requests for positions that may have been proposed in previous views. If a client's request, denoted as 𝑟𝑒𝑞, was sent in the previous view and the leader is faulty, it takes 𝛥 to trigger the timeout, two communication rounds to switch the view (including the proposal of 𝑟𝑒𝑞), and an additional four communication rounds for 𝑟𝑒𝑞 to be committed. Consequently, the consensus latency for 𝑟𝑒𝑞 is 𝛥 + 6𝛿, which can be considerably high.

Several studies following PBFT, such as SBFT [22], have attempted to reduce the consensus latency. Despite their notable progress, their emphasis remains on reducing the normal-case latency, leaving them susceptible to the substantial latency issue inherent in the view-change process. In the rest of this paper, we use the term “PBFT-like” to succinctly refer to these protocols.

Acknowledging that the view-change process significantly contributes to latency, our proposed protocol, Doppel, seeks to alleviate consensus latency for cross-view requests by introducing the concept of dual leadership within a view. This innovation allows for the commitment of requests even during the view-change process. In Doppel, two replicas are designated as leaders within a view, denoted as 𝐿₁ and 𝐿₂, with distinct priorities. Both 𝐿₁ and 𝐿₂ propose requests and broadcast quorum certificate (QC) data, and every replica votes for requests or QC data from either leader. We differentiate the QC data 𝑙𝑜𝑐𝑘 (𝑐𝑜𝑚𝑚𝑖𝑡) from the two leaders as 𝑙𝑜𝑐𝑘₁ (𝑐𝑜𝑚𝑚𝑖𝑡₁) and 𝑙𝑜𝑐𝑘₂ (𝑐𝑜𝑚𝑚𝑖𝑡₂), respectively. When 𝐿₁ is non-faulty, replicas receive the 𝑐𝑜𝑚𝑚𝑖𝑡₁ for each request 𝑟𝑒𝑞 and proceed to commit 𝑟𝑒𝑞, following a similar process to PBFT-like protocols, resulting in a consensus latency of 5𝛿. On the other hand, if 𝐿₁ is faulty while 𝐿₂ is non-faulty, the view-change process is triggered as soon as the replicas receive 𝑐𝑜𝑚𝑚𝑖𝑡₂, which takes five communication rounds. 𝑟𝑒𝑞 can then be committed in the second round of the view-change process, resulting in a consensus latency of 7𝛿. This latency is significantly lower than the 𝛥 + 6𝛿 latency in PBFT-like protocols. Table 1 provides a performance comparison between Doppel and PBFT-like protocols, highlighting the lower latency achieved by Doppel. Additionally, Doppel maintains the same message complexity as PBFT-like protocols, so the second leader does not change the asymptotic communication overhead.

It is worth noting that although the view-change protocol in Doppel involves two additional communication rounds, it does not result in a longer process duration. In PBFT-like protocols, when the leader is faulty, it takes 𝛥 + 2𝛿 to complete the view-change process. In contrast, in Doppel, when the first leader is faulty, it takes 9𝛿 to complete the view-change process. Considering that the value of 𝛥 is typically significantly larger than 𝛿, the latency of 9𝛿 tends to be smaller than 𝛥 + 2𝛿. Furthermore, while we present Doppel built on top of PBFT, it can be easily adapted to other leader-based BFT protocols, such as HotStuff [23], by incorporating two leaders within a view. This flexibility allows our consensus protocol to be widely applicable in blockchain-enabled cyber–physical systems.

To assess the performance of Doppel, we have implemented a system prototype and conducted a comparative analysis with PBFT and SBFT from various perspectives. Firstly, when the leader (𝐿₁ in Doppel) is non-faulty, Doppel achieves a similar latency to PBFT and SBFT. This demonstrates that our protocol can effectively match the performance of PBFT-like protocols under normal operating conditions. Secondly, in cases where the leader of the current view (𝐿₁ in Doppel) is faulty, but the leader of the next view (𝐿₂ in Doppel) is non-faulty, Doppel outperforms PBFT and SBFT by significantly reducing latency. This improvement is attributed to the ability of Doppel to commit requests during the view-change process, resulting in a much lower overall consensus latency. Furthermore, Doppel completes the view-change process faster, enabling the activation of a new view more rapidly than PBFT-like protocols. This reduction in view-change time enhances the overall responsiveness of the system. Through these performance evaluations, we demonstrate the superior efficiency and effectiveness of Doppel compared to PBFT and SBFT in various scenarios.

This study offers the following notable contributions:

• Identification of the Long Latency Issue: We identify and highlight the long latency problem present in the view-change process. By recognizing this limitation, we pave the way for addressing the issue and improving the overall performance of BFT consensus protocols.
• Proposal of Doppel: To mitigate the consensus latency problem, we introduce Doppel, a novel BFT protocol. Doppel addresses the latency issue by incorporating the concept of running two leaders within a view. This innovation enables more efficient and timely consensus in BFT systems.


• Extensive Experimental Evaluation: We evaluate the performance of Doppel through a series of comprehensive experiments. The results of these experiments provide empirical evidence of the feasibility and efficiency of our proposed protocol.¹

The remainder of this paper is structured as follows. In Section 2, we provide the necessary background information and discuss the motivation behind our work. Section 3 presents the design philosophy of Doppel, detailing its key features and mechanisms. In Section 4, we conduct a comprehensive theoretical analysis of Doppel, covering safety analysis and liveness analysis. The practical implementation, experiment settings, and evaluation results are presented in Section 5. Section 6 offers a review of the relevant literature and related work in the field. Finally, Section 7 concludes the paper.

2. Background & motivation

2.1. BFT consensus

Fig. 1. Overview of the normal-case protocol in PBFT.


The consensus protocol plays a crucial role in distributed systems by facilitating the coordination among different replicas to achieve agreement [24]. Consensus can be broadly classified into two categories: crash-fault-tolerance (CFT) and Byzantine-fault-tolerance (BFT). Replicas in a BFT consensus can exhibit behaviors beyond being crashed or correct, including being Byzantine or malicious [25]. As a result, replicas in a BFT consensus are categorized into two groups: Byzantine and non-faulty. Byzantine replicas, in addition to being non-responsive (crashed), can intentionally provide incorrect or deceptive messages. While BFT consensus may have limited applicability in traditional data center scenarios, it has gained significant attention in the past decade owing to the widespread adoption of blockchain technology [26]. In a blockchain system, replicas are operated by mutually distrustful parties, some of whom may engage in malicious behavior. Thus, the use of BFT consensus becomes essential in such contexts.

To be more specific, the consensus protocol serves the purpose of maintaining a consistent ledger across multiple replicas. This ledger consists of a sequence of ordered requests, with each request being submitted by a client. A request is considered committed when it is unanimously agreed upon through the protocol. Upon reaching a consensus, each agreed-upon request is allocated a position within the ledger, indicating its designated execution order.

A correct BFT consensus protocol must adhere to the following two fundamental properties:

• Safety: If two replicas commit requests 𝑟𝑒𝑞 and 𝑟𝑒𝑞′ at the same position, then 𝑟𝑒𝑞 and 𝑟𝑒𝑞′ must be identical.
• Liveness: If a request 𝑟𝑒𝑞 is submitted to the protocol, it will eventually be committed.

2.2. Timing assumption

In a distributed system where different replicas are interconnected by unstable network links, accurately characterizing the network conditions is crucial as it significantly impacts the design of the consensus protocol. One effective way to characterize the network is by establishing various assumptions about the timing involved in sending messages between replicas. In the consensus field, three primary types of timing assumptions are prevalent: synchronous, asynchronous, and partially synchronous [27].

In synchronous networks, messages are assumed to be delivered within a specified timeframe 𝛥, and choosing 𝛥 requires finding the right balance. An inadequately small 𝛥 risks safety issues, while an overly large one hampers consensus performance. Asynchronous networks avoid timing assumptions, but deterministic consensus is deemed impossible [28]. As a result, most protocols resort to randomness for probabilistic outcomes [29,30], introducing higher overhead and reduced performance even under favorable network conditions.

Partial synchrony, as an intermediary concept between synchronous and asynchronous assumptions, is widely regarded as a more realistic and practical assumption in consensus research [27]. In simple terms, a partially synchronous network assumes that although the network may exhibit asynchrony at times, there exists a Global Stabilization Time (GST) after which the network becomes synchronous for an extended duration. Numerous notable consensus protocols, such as PBFT [21] and HotStuff [23], are built upon the partially synchronous assumption due to its balance between performance and feasibility. PBFT, in particular, has long been considered a de facto standard for Byzantine Fault Tolerant consensus. It has undergone rigorous analysis and has proven to be an influential protocol in the field.

2.3. PBFT-like protocols

At a high level, the PBFT-like protocols proceed in successive views, in each of which a leader is designated to propose requests by assigning positions to requests. If everything goes well, requests can be committed in a view continuously. Otherwise, a timeout event is triggered, and the view is changed to the next. The PBFT-like protocols can be divided into two main components: the normal-case protocol and the view-change protocol. The normal-case protocol defines the rules and procedures within a view, while the view-change protocol specifies how the transition to the next view occurs. The original normal-case protocol proposed by Castro et al. [21] consists of three distinct phases, where each replica broadcasts messages during the last two phases. This results in a quadratic message complexity, which can limit scalability in larger systems. To address this scalability challenge, recent research efforts have focused on reducing the message complexity to linear by employing the leader as a central point for collecting and dispatching messages. This approach, adopted in works such as SBFT [22], has significantly improved the scalability of the PBFT protocol. Consequently, we directly describe the linear variant of the normal-case protocol in this section and refer to it as the normal-case protocol for brevity throughout the rest of our paper.
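
To give a feel for the quadratic and linear message complexities mentioned for PBFT-like protocols, the sketch below counts the messages exchanged in one voting step among 𝑛 replicas. The counting model is a simplifying assumption of ours (one message per sender and receiver pair, no self-delivery, no retransmissions), and the function names are hypothetical.

```python
def all_to_all_messages(n):
    """Messages when every replica broadcasts its vote to all other replicas."""
    return n * (n - 1)


def leader_relayed_messages(n):
    """Messages when votes are collected by the leader and the result is broadcast.

    n - 1 replicas send a vote to the leader, then the leader sends the
    aggregated certificate to the n - 1 other replicas.
    """
    return 2 * (n - 1)


if __name__ == "__main__":
    for n in (4, 16, 64):
        print(n, all_to_all_messages(n), leader_relayed_messages(n))
```

Under this model the all-to-all count grows as 𝑂(𝑛²) while the leader-relayed count grows as 𝑂(𝑛), which is the difference the linear variant exploits.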
¹ https://github.com/iris-42/Doppel

2.3.1. Normal-case in PBFT-like protocols

Within a view, the leader will keep proposing requests. Without loss of generality, we assume the leader proposes one request every time.


As shown in Fig. 1, the leader 𝑝₀ proposes a request by broadcasting pre-prepare messages in the first phase (phase 1). Upon receiving a valid pre-prepare message, each replica votes for this request by creating a partial threshold signature and sending it to 𝑝₀ through a prepare message (phase 2–1). After receiving 𝑛 − 𝑓 prepare messages (given that the total number of replicas is at least 3𝑓 + 1), 𝑝₀ combines the partial threshold signatures into a complete threshold signature, which forms a quorum certificate (QC). 𝑝₀ then broadcasts the QC through lockQC messages (phase 2–2). A replica that receives a valid lockQC message stores the QC as the lock data for the request and further votes for it by sending a partial threshold signature on the lock data through a commit message (phase 3–1). 𝑝₀ creates a complete threshold signature after receiving 𝑛 − 𝑓 commit messages and broadcasts it through commitQC messages. A replica that receives a valid commitQC message stores the QC and commits the request (phase 3–2).

Latency: If everything goes well, a request can be committed within 5𝛿, where 𝛿 denotes the actual network delay.

2.3.2. View-change in PBFT-like protocols

If a replica cannot commit a request within a pre-defined period 𝛥, the timeout event is triggered, and the replica broadcasts a view-change message, as shown in Fig. 2. The view-change message contains all the lock data stored by the replica, each of which corresponds to a request at a position. After receiving 𝑛 − 𝑓 view-change messages, the leader of the next view (𝑝₁ in Fig. 2) processes the lock data contained in these messages and broadcasts a new-view message. In the new-view message, 𝑝₁ proposes requests for positions that may already have been proposed in the previous view. If there is lock data for request 𝑟𝑒𝑞 at position 𝑘, 𝑝₁ reproposes 𝑟𝑒𝑞 at 𝑘 in the new-view message. Otherwise, 𝑝₁ can propose an arbitrary request at 𝑘. The new-view message also contains all the view-change messages as proof. A replica that receives a valid new-view message can enter the next view successfully. For each request proposed in the new-view message, the replica processes the request as in the latter phases of the normal-case protocol. After broadcasting the new-view message, 𝑝₁ runs the normal-case protocol by broadcasting pre-prepare messages.

Fig. 2. Overview of the view-change protocol in PBFT, where phases 2–1, 2–2, 3–1, and 3–2 deal with the requests proposed in the new-view messages.

Latency: If the leader is Byzantine or faulty and view-change is triggered, the request that failed in the previous view experiences a delay of up to 𝛥 + 6𝛿 before being committed in the new view.

2.4. Motivation

Based on the description of the PBFT-like protocols, when the leader becomes corrupted, a request 𝑟𝑒𝑞 cannot be committed within the current view. After the expiration of a timeout period of 𝛥, a view-change event is triggered, and 𝑟𝑒𝑞 is reproposed in the “new-view” message. If the leader of the new view is non-faulty, it takes an additional 6𝛿 for 𝑟𝑒𝑞 to be committed in the new view (as illustrated in Fig. 2). Requests that have been proposed but fail to be committed within a particular view are referred to as “cross-view requests”. Given the existence of cross-view requests and their extended consensus latency, a natural question arises: Can we reduce the consensus latency of cross-view requests?

3. Design of Doppel

In this section, we present the model and definitions of Doppel. After providing an overview, we elaborate on the two parts of the protocol, namely the normal-case protocol and the view-change protocol, respectively.

3.1. Model & definitions

The system consists of 𝑛 ≥ 3𝑓 + 1 replicas, denoted by {𝑝ᵢ ∣ 0 ≤ 𝑖 ≤ 3𝑓}, where at most 𝑓 of them are Byzantine. The Byzantine replicas are all coordinated by an adversary. Each pair of replicas is connected through a reliable network link, where a message sent from a non-faulty replica will eventually be delivered to another non-faulty replica. We assume a partially synchronous network, as in PBFT-like protocols, wherein the network becomes synchronous after GST. We further assume that a clock synchronization tool is accessible after GST, which helps all the replicas possess a roughly identical clock. Note that this clock synchronization tool does not affect the safety of our protocol. It only influences liveness if the tool is inaccessible, which is similar to the assumption of a partially synchronous network.

We assume that a Public Key Infrastructure (PKI) is established among the replicas, where the public keys of all the replicas are known to each one while the private key is maintained secretly by each replica. We also assume a Threshold Signature Infrastructure (TSI) in the system, where the threshold value is set as 𝑛 − 𝑓. A collision-resistant hash function is also required, which can map arbitrary data to a fixed-size output. The adversary is constrained to possess limited computation capability, meaning that it cannot break PKI, TSI, or the hash function.

As a consensus algorithm, Doppel is used to consistently order requests among all non-faulty replicas. Requests are sent from clients to leaders of the current view. We omit the detailed process of interaction between clients and leaders since it is not a critical part of the consensus algorithm. We assume the leaders are simply selected in a round-robin manner.

3.2. Overview of Doppel

Similar to PBFT-like protocols, Doppel also operates in successive views. It includes two parts: the normal-case protocol and the view-change protocol, which stipulate the rules in a view and the rules to change views, respectively. If the view-change process is triggered, Doppel enables a request to be committed through the view-change or new-view messages directly without undergoing the additional phases (phases 2–1, 2–2, 3–1, and 3–2 in Fig. 2). Besides, Doppel designates two leaders in a view, as opposed to one leader in PBFT-like protocols. Both leaders can propose new requests. As long as one leader is non-faulty, requests can be committed in the current view or through the view-change or new-view messages, thus reducing the consensus latency. After the view change is completed, the second leader in the previous view becomes the first leader in the new view, and another replica is designated as the second leader in the new view.

We do not describe the checkpoint mechanism, high water marks, and low water marks in this paper since they are essentially the same as in PBFT [21]. We also omit the means to facilitate liveness, such as increasing the timer value and broadcasting view-change messages early, which have been detailed in the PBFT paper [21].


Fig. 3. Overview of the normal-case protocol in Doppel.

3.3. Normal-case protocol

Fig. 3 depicts the normal-case protocol in Doppel, where Fig. 3(a) details the different messages while Fig. 3(b) briefly describes the protocol. We denote the two leaders as 𝐿₁ and 𝐿₂, respectively, which correspond to 𝑝₀ and 𝑝₁ in Fig. 3(a). 𝐿₁ has a higher priority than 𝐿₂. Both 𝐿₁ and 𝐿₂ keep proposing requests for each position. A request in Doppel is in the format of ⟨𝑝𝑜𝑠, 𝑒, 𝑟𝑒𝑞⟩, where 𝑒 ∈ {1, 2} represents the leader number and 𝑝𝑜𝑠 denotes the position in the ledger. For brevity, we describe the protocol for a request at a particular position in this section and omit the position fields in the messages. Each replica participates in the processing of both leaders' requests. Therefore, for a position 𝑘, each replica may store two lock data. We denote these two lock data sent from 𝐿₁ and 𝐿₂ as 𝑙𝑜𝑐𝑘₁ and 𝑙𝑜𝑐𝑘₂, respectively.

We simplify the protocol with an abstraction of the message broadcast and collection. As shown in Fig. 3(b), we refer to a message broadcast from a leader as 𝐵 (broadcast) and a message collection plus a message broadcast as 𝐶𝐵 (collection & broadcast). Therefore, the protocol related to one leader can be considered as one 𝐵, followed by two consecutive 𝐶𝐵. We name the process of 𝐵 + 𝐶𝐵 + 𝐶𝐵 with 𝐿 being the leader as an agreement process led by 𝐿. At the end of the first 𝐶𝐵 of either agreement process, denoted by 𝐶𝐵¹₁ or 𝐶𝐵¹₂, each replica can store lock data. At the end of the second 𝐶𝐵, denoted by 𝐶𝐵²₁ or 𝐶𝐵²₂, the two agreement processes differ. A replica that receives a commit message from the agreement led by 𝐿₁ will commit the request directly. By contrast, a replica that receives a commit message from the agreement led by 𝐿₂ will not commit the request. Instead, it will consider broadcasting a view-change message.

The clients will send their requests to both leaders simultaneously.

… broadcasting a pre-prepare message. Therefore, for a position, two leaders can propose the same request simultaneously. However, since the network conditions related to each leader may differ, and leaders may break down, the agreement processes led by two leaders may advance at different paces. In the rest of this section, we will discuss various scenarios of two agreement processes and describe the corresponding rules for each.

We introduce two timing parameters in Doppel: 𝛥₁ and 𝛥₂. The value of 𝛥₁ is set in the same way as in PBFT-like protocols. A timer of 𝛥₁ is started at the beginning of a view and restarted every time a request is committed in the normal case. The situations of two agreement processes can be divided into four categories, as depicted by Fig. 4, which are described as follows:

• Case 1: If a replica receives a commit message from 𝐿₁ before the timer of 𝛥₁ expires, it can commit the request proposed by 𝐿₁ directly.
• Case 2: If a replica receives a commit message from 𝐿₂ before the timer of 𝛥₁ expires and it has not received 𝑙𝑜𝑐𝑘₁, then it will immediately broadcast a view-change message.
• Case 3: If a replica receives a commit message from 𝐿₂ before the timer of 𝛥₁ expires and it has received 𝑙𝑜𝑐𝑘₁, then it will start a new timer of 𝛥₂, which equals an estimated network delay of two communication rounds. After either the 𝛥₁ or the 𝛥₂ timer expires, it will broadcast a view-change message.
• Case 4: If the timer of 𝛥₁ expires, a replica will broadcast a view-change message.

Latency: From the rules above, we can see that 𝐿₁ has a higher priority than 𝐿₂. In case 1, the replica can commit the request from 𝐿₁ directly. Therefore, in good conditions where the network is stable and 𝐿₁ is non-faulty, a request from 𝐿₁ can be committed in 5𝛿 without being affected by the additional leader (i.e., 𝐿₂), which is the same as in PBFT-like protocols. In cases 2, 3, and 4, the view-change process will be launched, and the request from 𝐿₂ cannot be committed in the current view directly. Instead, it will be committed during the view-change process, which will be detailed in the next section. If a commit message from 𝐿₂ is received first, the replica takes actions based on whether 𝑙𝑜𝑐𝑘₁ has been received. If it has not, the replica judges 𝐿₁ as faulty and triggers the view change immediately. Otherwise, the replica will start a timer of 𝛥₂ and wait for its expiration. The intuition of setting 𝛥₂ as an estimated network delay of two communication rounds is that it takes two communication rounds to receive a commit message from 𝐿₁ after 𝑙𝑜𝑐𝑘₁ is received. When the timer of either 𝛥₁ or 𝛥₂ expires, the view change is triggered.

3.4. View-change protocol

We refer to the view-change process that switches from view 𝑣 to view 𝑣 + 1 as the view-change process of 𝑣. Additionally, we denote the two leaders in view 𝑣 as 𝐿ᵛ₁ and 𝐿ᵛ₂, respectively. Our adaptation of the view-change protocol includes two parts. First, we enable a request to be committed through the view-change messages. Second, to enable requests from 𝐿ᵛ₂ to be committed, we add two more rounds in the view-change process. To simplify the presentation, we describe the rules related to one position in the ledger and describe how to commit or repropose a request at this position. Besides, when we mention lock data possessed by a replica 𝑝ᵢ, we mean the lock data with the largest wave number possessed by 𝑝ᵢ. The view-change protocol in Doppel is depicted in Fig. 5. There are three kinds of view-change messages in Doppel, denoted by VC1, VC2, and VC3, respectively, each of which corresponds to a broadcast phase. Next, we will detail these three
After receiving a request, either leader will propose this request by phases one by one.

R. Hao et al. Journal of Systems Architecture 148 (2024) 103087

Fig. 4. Different cases of the consensus processes led by two leaders.

Fig. 5. Overview of the view-change protocol in Doppel.

3.4.1. Phase 𝑉𝐶1

Algorithm 1 describes the rules in phase 𝑉𝐶1. Each replica will only broadcast the 𝑙𝑜𝑐𝑘1 data if it has one, which is consistent with the setting where 𝐿1 has a higher priority than 𝐿2. Otherwise, it will broadcast empty data, denoted by ⊥. Moreover, if a replica wants to broadcast ⊥, it will create a partial threshold signature on ⊥ and broadcast this signature along with ⊥, as shown by Lines 4–5 in Algorithm 1. The partial threshold signature serves as the replica's attestation that it reported empty data, which is needed to ensure the safety of the protocol.

Algorithm 1 Phase 𝑉𝐶1 in the view-change process of view 𝑣 in Doppel (for replica 𝑝𝑖)
Let 𝑠𝑖𝑔𝑛 denote the threshold signature function and let 𝑠𝑘𝑖 represent the threshold secret share.
1: if 𝑝𝑖 has 𝑙𝑜𝑐𝑘1, then:
2:     broadcast ⟨VC1, 𝑙𝑜𝑐𝑘1⟩
3: else:
4:     𝜌 ← 𝑠𝑖𝑔𝑛_{𝑛−𝑓}(𝑠𝑘𝑖, VC1, ⊥)    ⊳ create partial threshold signature 𝜌
5:     broadcast ⟨VC1, ⊥, 𝜌⟩

3.4.2. Phase 𝑉𝐶2

After receiving 𝑛 − 𝑓 VC1 messages, the replica will take actions based on their contents, which fall into four cases:
• If all these VC1 messages contain 𝑙𝑜𝑐𝑘1, the replica can directly commit the request proposed by 𝐿_1^𝑣 if it has not already done so.
• If both 𝑙𝑜𝑐𝑘1 and ⊥ are contained, the replica will create two partial threshold signatures 𝜌1 and 𝜌2 on tags named LOCK1 and NON-LOCK2, respectively. These two partial signatures will be used for different cases in phase 𝑉𝐶3. The replica then broadcasts 𝑙𝑜𝑐𝑘1, 𝜌1, and 𝜌2.
• If all the data is ⊥ while the replica has the 𝑙𝑜𝑐𝑘2 data, the replica will combine the partial threshold signatures in these messages into a complete threshold signature 𝜎, and broadcast 𝑙𝑜𝑐𝑘2 and 𝜎.
• If all the data is ⊥ and the replica has no 𝑙𝑜𝑐𝑘2 data, the replica will combine the partial threshold signatures in these messages into a complete threshold signature 𝜎. Additionally, it will also create a partial threshold signature 𝜌 on a tag named NON-LOCK2. Then, it will broadcast ⊥, 𝜎, and 𝜌.

Rules of phase 𝑉𝐶2 are described by Algorithm 2. Note that after receiving the lock data in a message, the replica will check the validity of the lock data. Only lock data that passes this check is accepted. For brevity, we omit this validity checking in both the above descriptions and Algorithm 2. Besides, three kinds of data (i.e., 𝑙𝑜𝑐𝑘1, 𝑙𝑜𝑐𝑘2, and ⊥) can be broadcast in phase 𝑉𝐶2, as opposed to two kinds of data (i.e., 𝑙𝑜𝑐𝑘1 and ⊥) in phase 𝑉𝐶1. Restricting phase 𝑉𝐶1 to 𝑙𝑜𝑐𝑘1 and ⊥ ensures the priority of leader 𝐿1, while allowing 𝑙𝑜𝑐𝑘2 in phase 𝑉𝐶2 makes it possible to commit a request during the view-change process and thereby reduce system latency.

Algorithm 2 Phase 𝑉𝐶2 in the view-change process of view 𝑣 in Doppel (for replica 𝑝𝑖)
Let 𝑠𝑖𝑔𝑛 and 𝑐𝑜𝑚𝑏𝑖𝑛𝑒 denote the threshold signature functions, and let 𝑠𝑘𝑖 represent the threshold secret share.
1: upon receiving 𝑛 − 𝑓 VC1 messages, do:
2:     if all these VC1 messages are ⟨VC1, 𝑙𝑜𝑐𝑘1⟩, then:
3:         commit the request from 𝐿_1^𝑣
4:     else if there is one ⟨VC1, 𝑙𝑜𝑐𝑘1⟩, then:
5:         ⊳ create partial threshold signatures 𝜌1 and 𝜌2
6:         𝜌1 ← 𝑠𝑖𝑔𝑛_{𝑛−𝑓}(𝑠𝑘𝑖, VC2, LOCK1)
7:         𝜌2 ← 𝑠𝑖𝑔𝑛_{𝑛−𝑓}(𝑠𝑘𝑖, VC2, NON-LOCK2)
8:         broadcast ⟨VC2, 𝑙𝑜𝑐𝑘1, 𝜌1, 𝜌2⟩
9:     else:    ⊳ all the data is ⊥
10:        𝑠𝜌 ← all the 𝜌 from VC1 messages
11:        𝜎 ← 𝑐𝑜𝑚𝑏𝑖𝑛𝑒_{𝑛−𝑓}(𝑠𝜌)    ⊳ combine signatures
12:        if 𝑝𝑖 has 𝑙𝑜𝑐𝑘2, then:
13:            broadcast ⟨VC2, 𝑙𝑜𝑐𝑘2, 𝜎⟩
14:        else:
15:            𝜌 ← 𝑠𝑖𝑔𝑛_{𝑛−𝑓}(𝑠𝑘𝑖, VC2, NON-LOCK2)
16:            broadcast ⟨VC2, ⊥, 𝜎, 𝜌⟩

3.4.3. Phase 𝑉𝐶3

Similar to 𝑉𝐶2, the replica will also take actions based on the data contained in 𝑛 − 𝑓 VC2 messages, which fall into four cases as well:
• If all these VC2 messages contain 𝑙𝑜𝑐𝑘2, the replica can directly commit the request proposed by 𝐿_2^𝑣.
• If all these VC2 messages contain 𝑙𝑜𝑐𝑘1, the replica will create a complete threshold signature 𝜎 based on the 𝜌1 contained in these messages, and then broadcast 𝑙𝑜𝑐𝑘1 and 𝜎.
• If there is at least one 𝑙𝑜𝑐𝑘2 but fewer than 𝑛 − 𝑓 𝑙𝑜𝑐𝑘2, the replica will broadcast 𝑙𝑜𝑐𝑘2.
• If all the messages contain either 𝑙𝑜𝑐𝑘1 or ⊥ and there is at least one ⊥, the replica will create a complete threshold signature 𝜎 on NON-LOCK2 based on the 𝜌2 or 𝜌 data. It will then broadcast ⊥ and 𝜎.
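The case analysis of phases 𝑉𝐶2 and 𝑉𝐶3 can be sketched as two pure decision functions. The sketch below is an illustration only: the names `Data`, `vc2_action`, and `vc3_action` are hypothetical, threshold signatures are represented by string labels rather than real cryptography, and the validity checks on lock data are omitted, as in the text.

```python
from enum import Enum


class Data(Enum):
    """Payload kinds carried by view-change messages (labels are illustrative)."""
    LOCK1 = "lock1"
    LOCK2 = "lock2"
    BOT = "bot"  # the empty value (the bottom symbol in the text)


def vc2_action(vc1_payloads, has_lock2):
    """Decide the phase-VC2 action from the payloads of n - f VC1 messages."""
    if not vc1_payloads or any(d is Data.LOCK2 for d in vc1_payloads):
        raise ValueError("VC1 messages carry only lock1 or the empty value")
    if all(d is Data.LOCK1 for d in vc1_payloads):
        return ("commit", "L1")
    if any(d is Data.LOCK1 for d in vc1_payloads):
        # Mixed lock1 / empty: sign both tags, LOCK1 and NON-LOCK2.
        return ("broadcast", Data.LOCK1, ("rho1", "rho2"))
    if has_lock2:
        # All empty: the partial signatures combine into sigma.
        return ("broadcast", Data.LOCK2, ("sigma",))
    # All empty and no lock2: also sign the NON-LOCK2 tag.
    return ("broadcast", Data.BOT, ("sigma", "rho"))


def vc3_action(vc2_payloads):
    """Decide the phase-VC3 action from the payloads of n - f VC2 messages."""
    if not vc2_payloads:
        raise ValueError("expected n - f VC2 messages")
    if all(d is Data.LOCK2 for d in vc2_payloads):
        return ("commit", "L2")
    if all(d is Data.LOCK1 for d in vc2_payloads):
        # sigma is combined from the rho1 partial signatures.
        return ("broadcast", Data.LOCK1, ("sigma",))
    if any(d is Data.LOCK2 for d in vc2_payloads):
        return ("broadcast", Data.LOCK2, ())
    # Only lock1 / empty with at least one empty: sigma on NON-LOCK2.
    return ("broadcast", Data.BOT, ("sigma",))
```

The order of the checks mirrors the order of the bullets: the two unanimous cases are tested first, so the "at least one" cases only fire when the quorum is mixed.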


Algorithm 3 Phase 𝑉𝐶3 in the view-change process of view 𝑣 in Doppel (for replica 𝑝𝑖)
Let 𝑠𝑖𝑔𝑛 and 𝑐𝑜𝑚𝑏𝑖𝑛𝑒 denote the threshold signature functions, and let 𝑠𝑘𝑖 represent the threshold secret share.
1: upon receiving 𝑛 − 𝑓 VC2 messages, do:
2:     if all these VC2 messages are ⟨VC2, 𝑙𝑜𝑐𝑘2⟩, then:
3:         commit the request from 𝐿_2^𝑣
4:     else if all these VC2 messages are ⟨VC2, 𝑙𝑜𝑐𝑘1, 𝜌1, 𝜌2⟩, then:
5:         𝑠𝜌 ← all the 𝜌1 from VC2 messages
6:         𝜎 ← 𝑐𝑜𝑚𝑏𝑖𝑛𝑒_{𝑛−𝑓}(𝑠𝜌)    ⊳ combine signatures
7:         broadcast ⟨VC3, 𝑙𝑜𝑐𝑘1, 𝜎⟩
8:     else if there is one ⟨VC2, 𝑙𝑜𝑐𝑘2⟩, then:
9:         broadcast ⟨VC3, 𝑙𝑜𝑐𝑘2⟩
10:    else:    ⊳ messages contain either 𝑙𝑜𝑐𝑘1 or ⊥
11:        𝑠𝜌 ← all the 𝜌2 from ⟨VC2, 𝑙𝑜𝑐𝑘1, 𝜌1, 𝜌2⟩ messages and all the 𝜌 from ⟨VC2, ⊥, 𝜎, 𝜌⟩ messages
12:        𝜎 ← 𝑐𝑜𝑚𝑏𝑖𝑛𝑒_{𝑛−𝑓}(𝑠𝜌)
13:        broadcast ⟨VC3, ⊥, 𝜎⟩
14: upon receiving 𝑛 − 𝑓 VC3 messages, do:
15:    if all these VC3 messages are ⟨VC3, 𝑙𝑜𝑐𝑘2⟩, then:
16:        commit the request from 𝐿_2^𝑣
17:    if 𝑝𝑖 is a leader in view 𝑣 + 1, then:
18:        if there is one message of ⟨VC3, 𝑙𝑜𝑐𝑘2⟩, then:
19:            repropose the request from 𝐿_2^𝑣
20:        else if there is one message of ⟨VC3, 𝑙𝑜𝑐𝑘1, 𝜎⟩, then:
21:            repropose the request from 𝐿_1^𝑣
22:        else:
23:            propose a request arbitrarily with a proof

At the end of phase 𝑉𝐶3, the replica will take actions based on the received 𝑛 − 𝑓 VC3 messages. If all the VC3 messages contain 𝑙𝑜𝑐𝑘2, the replica will commit the request from 𝐿_2^𝑣. Rules of phase 𝑉𝐶3 are described in Algorithm 3. We also omit the checking of threshold signatures for brevity.

After the three view-change phases are finished, the second leader (i.e., 𝑝1 in Fig. 5) in the previous view becomes the first leader in the new view, and a new replica (i.e., 𝑝2 in Fig. 5) is designated as the second leader. Both leaders will broadcast new-view messages, as shown by 𝐵-𝑁𝑉1 and 𝐵-𝑁𝑉2 in Fig. 5. Both leaders in the new view must determine the request for each position, following the rules shown by Lines 17–23 of Algorithm 3. Roughly speaking, if there is lock data for a request, the leader will repropose this request; otherwise, it will propose a request arbitrarily with a proof. This proof can be simply set as the combination of received VC3 messages. Every replica that receives the new leaders' requests will check the correctness of these proofs.

Latency: If 𝐿_1^𝑣 is faulty while 𝐿_2^𝑣 is non-faulty, each replica can receive 𝑛 − 𝑓 𝑙𝑜𝑐𝑘2 data at the beginning of phase 𝑉𝐶3 and commit the request from 𝐿_2^𝑣 directly. As a result, it takes 7𝛿 for a request sent from 𝐿_2^𝑣 to get committed, which is much smaller than the 10𝛿 + 𝛥 required by PBFT-like protocols (presented in Section 2.3).

4. Correctness analysis

The correctness analysis of Doppel includes two fundamental aspects: safety and liveness.

4.1. Safety analysis

The safety property is interpreted as Theorem 3, whose proof relies on two lemmas, namely Lemmas 1 and 2.

Lemma 1. If two replicas commit two requests 𝑟𝑒𝑞 and 𝑟𝑒𝑞′ at the same position either in view 𝑣 or in the view-change process of 𝑣, then 𝑟𝑒𝑞 = 𝑟𝑒𝑞′.

Proof. For a position in a view, there can be two kinds of requests, which are proposed by the two leaders respectively. A request 𝑟𝑒𝑞1 proposed by the first leader may be committed at two time points: (1.1) when the commit data of 𝑟𝑒𝑞1 is received in the normal-case process and (1.2) when 𝑛 − 𝑓 VC1 messages of 𝑟𝑒𝑞1 are received at the beginning of phase VC2 (Line 3 of Algorithm 2). A request 𝑟𝑒𝑞2 proposed by the second leader may also be committed at two time points: (2.1) when 𝑛 − 𝑓 VC2 messages of 𝑟𝑒𝑞2 are received at the beginning of phase VC3 (Line 3 of Algorithm 3) and (2.2) when 𝑛 − 𝑓 VC3 messages of 𝑟𝑒𝑞2 are received at the end of phase VC3 (Line 16 of Algorithm 3). Denote the two replicas as 𝑝𝑥 and 𝑝𝑦, which commit 𝑟𝑒𝑞 and 𝑟𝑒𝑞′, respectively. Next, we prove the lemma in four cases.

Case 1 (𝑝𝑥 commits 𝑟𝑒𝑞 at point (1.1)). In this case, 𝑟𝑒𝑞 is a request proposed by the first leader, and at least 𝑓 + 1 non-faulty replicas must possess lock data of 𝑟𝑒𝑞, each of which will broadcast this lock data in phase VC1. According to the rules in Section 3.4.2, each replica will broadcast lock data of 𝑟𝑒𝑞 in phase VC2. Note that since Byzantine replicas will receive at least one lock data of 𝑟𝑒𝑞, they cannot create a complete threshold signature 𝜎 (as in Lines 9–11 of Algorithm 2). In other words, even the Byzantine replicas can only broadcast lock data of 𝑟𝑒𝑞 in phase VC2. Therefore, all VC2 messages received by a replica contain lock data of 𝑟𝑒𝑞, and each replica will broadcast this lock data in phase VC3 (Lines 4–7 of Algorithm 3). Thus, in this case, all messages broadcast in the view-change process contain locks of 𝑟𝑒𝑞. Each replica, including 𝑝𝑦, can only commit 𝑟𝑒𝑞, at either point (1.1) or point (1.2).

Case 2 (𝑝𝑥 commits 𝑟𝑒𝑞 at point (1.2)). In this case, 𝑟𝑒𝑞 is also a request proposed by the first leader. According to Case 1, if 𝑝𝑦 commits 𝑟𝑒𝑞′ at point (1.1), then 𝑟𝑒𝑞′ = 𝑟𝑒𝑞, which proves the lemma. Next, we assume 𝑝𝑦 commits at a point other than (1.1). Since 𝑝𝑥 commits 𝑟𝑒𝑞 at point (1.2), each replica will receive at least one lock data of 𝑟𝑒𝑞 in VC1 messages and can only broadcast this lock data in VC2 messages. The remainder of the argument is similar to Case 1, and each replica can only commit 𝑟𝑒𝑞.

Case 3 (𝑝𝑥 commits 𝑟𝑒𝑞 at point (2.1)). In this case, 𝑟𝑒𝑞 is a request proposed by the second leader. According to the above two cases, if 𝑝𝑦 commits 𝑟𝑒𝑞′ at point (1.1) or (1.2), then 𝑟𝑒𝑞 = 𝑟𝑒𝑞′, and the lemma is proved. If 𝑝𝑦 commits 𝑟𝑒𝑞′ at point (2.1), then 𝑟𝑒𝑞 = 𝑟𝑒𝑞′ according to the quorum mechanism. Next, we consider that 𝑝𝑦 commits at point (2.2). Since 𝑝𝑥 commits 𝑟𝑒𝑞 at point (2.1), 𝑝𝑥 must receive at least 𝑛 − 𝑓 VC2 messages that contain lock data of 𝑟𝑒𝑞. Therefore, each replica will receive at least one VC2 message containing this lock data and broadcast this lock data in the VC3 message (Lines 8–9 of Algorithm 3). Therefore, 𝑝𝑦 can only commit 𝑟𝑒𝑞 if it commits at point (2.2).

Case 4 (𝑝𝑥 commits 𝑟𝑒𝑞 at point (2.2)). In this case, 𝑟𝑒𝑞 is also a request proposed by the second leader. According to the three cases mentioned above, if 𝑝𝑦 commits 𝑟𝑒𝑞′ at any point except (2.2), then 𝑟𝑒𝑞 = 𝑟𝑒𝑞′, thereby proving the lemma. Next, we assume 𝑝𝑦 commits 𝑟𝑒𝑞′ at point (2.2), which means both 𝑝𝑥 and 𝑝𝑦 commit requests based on VC3 messages. According to the quorum mechanism, 𝑟𝑒𝑞′ must be identical to 𝑟𝑒𝑞. □

Lemma 2. If a replica commits a request 𝑟𝑒𝑞 at position 𝑘 in view 𝑣 or in the view-change process of 𝑣, then every leader in view 𝑤 (𝑤 > 𝑣) reproposes 𝑟𝑒𝑞 at 𝑘.

Proof. We denote the replica as 𝑝𝑥 and prove the lemma in three cases.

Case 1 (𝑝𝑥 commits 𝑟𝑒𝑞 at either point (1.1) or point (1.2)). In this case, 𝑟𝑒𝑞 is a request proposed by the first leader in the previous view. According to Case 1 or Case 2 in the proof of Lemma 1, each replica will broadcast the lock data of 𝑟𝑒𝑞 in phase VC3. Therefore, leaders in the subsequent views will repropose 𝑟𝑒𝑞, as shown by Lines 20–21 of Algorithm 3.

Case 2 (𝑝𝑥 commits 𝑟𝑒𝑞 at point (2.1)). In this case, 𝑟𝑒𝑞 is a request proposed by the second leader in the previous view. According to Case 3 in the proof of Lemma 1, each replica will also broadcast the lock data of 𝑟𝑒𝑞 in phase VC3. Therefore, leaders in the subsequent views can


only receive VC3 messages that contain this lock data and repropose 𝑟𝑒𝑞 according to Lines 18–19 of Algorithm 3.

Case 3 (𝑝𝑥 commits 𝑟𝑒𝑞 at point (2.2)). In this case, 𝑟𝑒𝑞 is also a request proposed by the second leader in the previous view. It is easy to see that each replica will receive at least one VC3 message that contains the lock data of 𝑟𝑒𝑞. According to Lines 18–19 of Algorithm 3, both leaders will repropose 𝑟𝑒𝑞. □

Theorem 3 (Safety). If two replicas commit two requests 𝑟𝑒𝑞 and 𝑟𝑒𝑞′ at the same position, respectively, then 𝑟𝑒𝑞 = 𝑟𝑒𝑞′.

Proof. Denote these two replicas as 𝑝𝑥 and 𝑝𝑦, respectively. Additionally, assume that 𝑝𝑥 commits 𝑟𝑒𝑞 in view 𝑣 or in the view-change process of 𝑣. Similarly, assume that 𝑝𝑦 commits 𝑟𝑒𝑞′ in view 𝑢 or in the view-change process of 𝑢. We prove this theorem in two cases:
• Case 1 (𝑣 = 𝑢): This case indicates that both 𝑝𝑥 and 𝑝𝑦 commit the request proposed by the leaders in the view 𝑣(𝑢) or in the view-change process of 𝑣(𝑢). According to Lemma 1, we must have 𝑟𝑒𝑞 = 𝑟𝑒𝑞′.
• Case 2 (𝑣 ≠ 𝑢): Without loss of generality, we assume 𝑣 < 𝑢. According to Lemma 2, if 𝑝𝑥 commits 𝑟𝑒𝑞 in view 𝑣 or in the view-change process of 𝑣, each leader in view 𝑤 (𝑤 > 𝑣) will only repropose 𝑟𝑒𝑞. Therefore, each replica that commits requests in view 𝑤 (𝑤 > 𝑣), including 𝑝𝑦, will only commit 𝑟𝑒𝑞.
To sum up, if replicas commit requests at the same position, these requests must be identical, which concludes the proof. □

4.2. Liveness analysis

Since a client can send a request to every replica and leaders are selected in a round-robin manner, as long as the view is switched from one to the next, this request can eventually be proposed and committed. In other words, if any replica, including the leader in the next view, can finish the view-change process of a previous view and enter the normal-case process of the next view successfully, the liveness can be guaranteed. Therefore, we interpret the liveness property as Theorem 4. Besides, partially synchronous consensus protocols can only guarantee liveness after GST. As a result, we analyze the liveness directly under the assumption that GST has already passed.

Theorem 4 (Liveness). If a view 𝑣 fails to make progress in committing requests, each replica can eventually finish the view-change process of 𝑣 and enter the normal-case process of 𝑣 + 1 successfully.

Proof. If a view 𝑣 fails to make progress, each non-faulty replica will eventually broadcast a view-change message. In other words, each non-faulty replica will participate in the view-change process of 𝑣. From the three algorithms described in Section 3.4, we can find that a replica only needs to wait for 𝑛 − 𝑓 messages in each phase. Based on the received messages of the previous phase, a replica can smoothly broadcast messages of the next phase without being halted. As a result, each replica can successfully finish the view-change process of 𝑣 and enter the normal-case process of 𝑣 + 1. Besides, the leaders of 𝑣 + 1 can determine the request for each position according to Lines 17–23 of Algorithm 3 without being halted. Therefore, liveness is guaranteed. □

5. Implementation & evaluation

To evaluate the effectiveness of our proposed design, we implemented a system prototype of Doppel and conducted a series of comprehensive experiments. The primary objective of these experiments was to compare the performance of Doppel with that of PBFT [21] and SBFT [22], with a specific focus on latency and throughput. Note that although there are also some other works, such as HotStuff [23], that attempt to improve the consensus throughput by introducing the block and chain structures, PBFT is still the best-studied and most widely deployed protocol to date. Besides, the improvements made by these works are orthogonal to our design in this paper, and we plan to combine them in our future work.

5.1. Implementation & settings

A blockchain system prototype of Doppel has been implemented using the Golang programming language. To ensure fairness in our evaluation, we have also developed a PBFT implementation and an SBFT implementation from scratch using the same framework as Doppel. Our implementation totals approximately 3250 lines of code. To make the configuration consistent among different protocols, we set the parameter 𝑐 in SBFT to 0 and disable its optimistic path. To aid in the implementation, we have utilized various open-source libraries. For instance, we have leveraged the DEDIS advanced crypto library (https://github.com/dedis/kyber) for the implementation of threshold signatures. Additionally, we have utilized the Zinx library (https://github.com/aceld/zinx) to handle communication aspects within the system. All of our experiments have been conducted on the Google Cloud platform. Each replica has been deployed on an e2-standard-8 instance, which consists of 8 vCPUs and 32 GB of memory. The replicas have been distributed across nine different regions globally, simulating a realistic distributed environment. To emulate network connectivity, each pair of replicas is connected via a network link with a bandwidth of 200 Mbps. This setup allows us to evaluate the performance of Doppel, PBFT, and SBFT in a distributed environment with realistic network conditions.

5.2. Performance under a non-faulty leader

In this group of experiments, we have ensured that the leaders are non-faulty, allowing Doppel, PBFT, and SBFT to successfully commit requests within a view. For the latency comparison, we have considered two types of payloads: 0 KB and 100 KB. The purpose of the 0 KB payload is to evaluate the consensus performance independently, without being affected by the payload transfer. On the other hand, the 100 KB payload simulates a more realistic scenario where payload transfer is involved. We conducted experiments in six different settings, each involving a different number of replicas ranging from 4 to 40. To reduce experimental errors, each group of experiments has been repeated ten times. Regarding the parameter settings, we set 𝛥2 in Doppel as the estimated delay of two communication rounds. Additionally, we set both 𝛥 in PBFT/SBFT and 𝛥1 in Doppel as ten times the value of 𝛥2. These parameter choices allow us to appropriately capture and evaluate the performance characteristics of all three protocols.

The latency comparison results are presented in Fig. 6(a). Overall, Doppel achieves performance comparable to PBFT or SBFT, regardless of the payload size (0 KB or 100 KB) or the scale of the system. Specifically, when the number of replicas is 40, Doppel, PBFT, and SBFT exhibit latencies of 630.5 ms, 601 ms, and 591.5 ms, respectively, with a 0 KB payload. With a 100 KB payload, the latencies are 875.5 ms for Doppel, 856 ms for PBFT, and 863 ms for SBFT. These results demonstrate that introducing an additional leader in the view costs little latency when the leader is non-faulty (about 5% over PBFT at 40 replicas with a 0 KB payload).

Regarding the throughput comparison, we configured the number of replicas to 4 and 19, respectively. As we increase the payload size, namely from 200 KB to 1200 KB, we anticipate a rise in consensus throughput, eventually reaching its peak. Throughput, measured in Bytes Per Second (BPS), is illustrated in Fig. 6(b) for the three protocols compared. Consistent with the latency comparison results, the


throughput of these protocols demonstrates similar performance. An interesting observation is that the addition of an extra leader does not seem to hurt Doppel's performance much. This could be attributed to the parallel data broadcast conducted by the two leaders, which distributes the load evenly and does not significantly burden any one leader. Consequently, this parallel process contributes to the good performance observed.

Fig. 6. Performance under a non-faulty leader.

Fig. 7. Performance under a faulty leader with 𝑙𝑜𝑐𝑘1.

5.3. Performance under a faulty leader

In the experiments focusing on the faulty leader scenario, we consider two cases: Case 2 and Case 3. In Case 2, the first leader (𝐿1) in Doppel becomes faulty after the replicas have received the 𝑙𝑜𝑐𝑘1 data. In this situation, a replica that receives the commit message from the second leader (𝐿2) starts the timer of 𝛥2 and triggers the view-change process when the timer expires. In Case 3, 𝐿1 becomes faulty before the replicas receive the 𝑙𝑜𝑐𝑘1 data. In this case, a replica can trigger the view-change process immediately upon receiving the commit message from 𝐿2, as specified by the rules in Section 3.3. We compare the performance of Doppel, PBFT, and SBFT under these two scenarios in this group of experiments.

The experimental results under Case 2 are presented in Fig. 7. Overall, Doppel achieves lower latency and higher throughput compared to PBFT or SBFT, particularly when the number of replicas is small. Specifically, when the system consists of four replicas, Doppel achieves an 86.1% or 84.6% reduction in latency compared to PBFT or SBFT, respectively. As for the throughput, when the load is set to 100 KB and the replica count is four, Doppel outperforms PBFT or SBFT by 77.6% or 83.6%, respectively. However, as the number of replicas increases, the performance gap between Doppel and its counterparts becomes smaller. This can be attributed to the fact that in Doppel, two rounds of quadratic communication contribute to higher network congestion compared to PBFT, which involves only one round of quadratic communication. The network congestion worsens as the number of replicas increases.

The experimental results under Case 3 are presented in Fig. 8. Similar to Case 2, Doppel achieves significantly lower latency and higher throughput than the other two protocols, particularly when the number of replicas is small. However, unlike Case 2, Doppel in Case 3 continues to outperform PBFT by a significant margin even as the number of replicas increases to a large value. Specifically, when the system consists of 40 replicas, Doppel can reduce the latency by 56.6% or 54.5% compared to PBFT or SBFT, respectively. In terms of the throughput, when the replica count is four and the load is set to 100 KB, Doppel can improve the throughput by 68.8% or 80.6%, respectively. This is attributed to the fact that Doppel in Case 3 can trigger the view-change process earlier without waiting for a timer to expire.

5.4. Performance under dynamic network

To assess how our optimization behaves in a dynamic network, we introduced random delays to the messages, uniformly sampled from the interval [0, 100 ms]. With these added network delays, we compared the protocols' performance under three distinct settings: non-faulty leaders, faulty leaders with 𝑙𝑜𝑐𝑘1, and faulty leaders without 𝑙𝑜𝑐𝑘1. We configured the system scale at 4 and 19, respectively. By varying the payload size, we examined each protocol's trade-off between throughput and latency. The results are illustrated in Fig. 9.

As Fig. 9(a) shows, with non-faulty leaders, Doppel demonstrates performance on par with PBFT or SBFT. In scenarios involving faulty leaders, Doppel consistently outperforms PBFT or SBFT in terms of both latency and throughput, aligning with the conclusions drawn in Section 5.3. In summary, Doppel is able to enhance consensus performance even within a dynamic network.
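The delay injection used in this group of experiments can be mimicked with a few lines of Python. This is a sketch under stated assumptions, not the authors' test harness: `injected_delay_ms` and `quorum_arrival_ms` are hypothetical helpers, and the base latencies in the usage line are made-up illustrative values. It adds a delay drawn uniformly from [0, 100 ms] to each message and reports when a replica would have collected 𝑛 − 𝑓 of them.

```python
import random


def injected_delay_ms(rng: random.Random, max_delay_ms: float = 100.0) -> float:
    """Draw one extra message delay uniformly from [0, max_delay_ms]."""
    return rng.uniform(0.0, max_delay_ms)


def quorum_arrival_ms(base_latencies_ms, f, rng, max_delay_ms=100.0):
    """Time at which n - f messages have arrived, given one message per sender.

    base_latencies_ms -- undisturbed one-way latency of each sender's message
    f                 -- number of tolerated faulty replicas
    """
    n = len(base_latencies_ms)
    if not 0 <= f < n:
        raise ValueError("need 0 <= f < n")
    arrivals = sorted(
        base + injected_delay_ms(rng, max_delay_ms) for base in base_latencies_ms
    )
    # The quorum is complete when the (n - f)-th earliest message arrives.
    return arrivals[n - f - 1]
```

A call such as `quorum_arrival_ms([10.0, 20.0, 30.0, 40.0], f=1, rng=random.Random(0))` returns the arrival time of the third-earliest message under one random draw; passing a seeded `random.Random` keeps runs reproducible.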


Fig. 8. Performance under a faulty leader without 𝑙𝑜𝑐𝑘1.

6. Related work

6.1. Blockchain for CPS

Blockchain technology has been successfully applied to a range of industry domains, including manufacturing [2,3,31], healthcare [6,7,32], transportation [8,9,33] and energy management [11,12].

In the field of manufacturing, Angrish et al. [2] developed a prototype called FabRec, which utilizes blockchain to ensure trust and transparency in manufacturing processes. FabRec establishes a decentralized network of manufacturing machines and computing nodes, enabling automated transparency of an organization's capabilities and facilitating third-party verification. Additionally, smart contracts are employed to facilitate paperless contracts between participants. In the realm of cloud manufacturing, Li et al. [3] proposed a distributed peer-to-peer network architecture based on blockchain. This architecture enhances the security and scalability of cloud manufacturing by integrating resource, perception, manufacturing, infrastructure, and application layers.

In healthcare, blockchain technology has been extensively explored by researchers to address various challenges, including personal health data sharing and protection, as well as the prevention of counterfeit medications. To address these issues, Ekblaw et al. [6] introduced MedRec, a decentralized electronic health records management system. MedRec utilizes blockchain properties to enable secure data sharing, access authentication, confidential information protection, and accountability. Similarly, Xia et al. [7] focused on the problem of medical data sharing among untrusted custodians of medical big data and proposed MeDShare. MeDShare is a blockchain-based system that offers medical data access control, data provenance, and auditing. Notably, MeDShare employs smart contracts to monitor data behavior and revoke access to the affected data in case of unauthorized actions by the requester.

Several research works have explored the use of blockchain technology in establishing decentralized self-governing intelligent transport systems. With autonomous vehicles playing a crucial role in the future transportation infrastructure, machine-to-machine (M2M) transactions will become predominant. Pedrosa and Pau [8] highlighted the potential of blockchain in providing secure, flexible, and scalable M2M transactions for vehicle energy recharging. Rathore et al. [33] proposed a decentralized technique to enhance the security of on-board sensors and communication channels in intelligent transport systems. Their focus was on improving system efficiency and scalability while defending against attacks. Singh and Kim [9] utilized blockchain technology to ensure secure and reliable communication among connected intelligent vehicles. They introduced a reward mechanism based on trust bits exchanged during successful communication, and the trust bit information was recorded and maintained on the blockchain.

The establishment of a smart grid system is crucial for achieving an efficient and environmentally friendly energy system with a low carbon footprint. Researchers have recognized the potential of blockchain technology in improving the management and integrity of smart grid systems. Khaqqi et al. [11] proposed a novel emission trading scheme model that leverages blockchain technology to enhance management efficiency and prevent fraudulent activities. The model also incorporates a reputation mechanism within the trading system, where reputation serves as an indicator of participants' performance and commitment to emission reduction efforts. Pop et al. [12] introduced a distributed blockchain-based energy demand management system. This system records energy consumption data collected from smart metering devices and utilizes smart contracts to programmatically define each prosumer's expected energy flexibility. The system aims to balance energy demand with energy production in an efficient and automated manner.

Our research differs from existing studies that apply blockchain to specific domains or make minor modifications to consensus algorithms for specific use cases. Instead, our focus is on enhancing the efficiency of the consensus protocol itself. By doing so, our work has broader implications for all blockchain-enabled CPS that demand high security and rapid response.

6.2. Consensus algorithm

In the field of consensus algorithms, different timing assumptions give rise to three categories: synchronous, asynchronous, and partially synchronous protocols.

The concept of synchronous networks, originating from the work of Lamport et al. [25], has influenced the design of early consensus protocols such as Rampart [34] and SecureRing [35]. Recent works like Pili [36] and Sync HotStuff [37] also adopt the synchronous-network assumption. Synchronous protocols are known for their simplicity and ease of implementation. However, this assumption can be easily violated in an unstable network environment, jeopardizing the safety property of the consensus algorithm.

Setting a large timeout parameter 𝛥 to maintain safety can significantly reduce consensus efficiency. Bitcoin [14], for example, has a block interval of ten minutes, limiting its transaction processing capacity to a maximum of 7 transactions per second (TPS). Ethereum [38] improved upon Bitcoin by reducing the block interval to approximately fifteen seconds, allowing for a maximum of 15 TPS. Despite this improvement, the scalability of these protocols remains limited.

The FLP theorem [28] established the impossibility of designing a deterministic consensus protocol in an asynchronous network with even one faulty replica. Despite this result, extensive research has been conducted to overcome this impossibility by adopting different approaches. One approach is to relax the determinism requirement and introduce randomness into the protocol design. Early examples of


Fig. 9. Performance under dynamic network.

such asynchronous protocols include SINTRA [39] and CKPS01 [40]. Building upon these early attempts, recent works have made significant progress in improving the efficiency and practicality of asynchronous protocols. Examples of these works include HoneyBadgerBFT [41], BEAT [42], Aleph [43], DAG-Rider [44], Tusk [45], and BullShark [46].

Asynchronous consensus protocols aim to provide stronger robustness, but they can be challenging to understand and implement, especially for non-experts. Many of these protocols rely on complex algorithms such as Asynchronous Byzantine Agreement (ABA) [47] or Multivalued Validated Byzantine Agreement (MVBA) [40,48] as their core components. Moreover, the complexity of these protocols increases the likelihood of implementation bugs [49]. Furthermore, asynchronous protocols are designed conservatively, assuming the worst possible network conditions. While this design choice ensures safety even in challenging environments, it can also result in suboptimal performance even when the network conditions are relatively good.

Dwork et al. [27] were the pioneers in exploring the combination of synchronous and asynchronous networks, which laid the foundation for the development of partially synchronous consensus protocols. Among these protocols, Practical Byzantine Fault Tolerance (PBFT) [21] stands out as one of the most well-known and widely used consensus algorithms. PBFT successfully addresses the challenge of consensus in partially synchronous networks by reducing the protocol complexity from an exponential level to a more manageable polynomial level. This breakthrough has made PBFT a de facto standard for consensus protocols, and it has been adopted by numerous blockchain systems, including Hyperledger Fabric [50] and ELASTICO [51].

6.3. Improved PBFT algorithm

In the wake of PBFT's foundational contributions to Byzantine Fault Tolerance protocols, many works have sought to improve the performance of PBFT. SBFT [22] is one of the representative works, which seeks to improve consensus efficiency by introducing a fast-path commitment mechanism. Additionally, other works such as FastBFT [52] and Hybster [53] use trusted hardware to address performance issues. Moreover, with the emergence of blockchain technology, the data structures of blocks and chains have inspired new consensus designs, spawning a line of work that includes Tendermint [54], Pala [55], HotStuff [23], and Streamlet [56].

Furthermore, several research studies have explored the adoption and modification of the PBFT algorithm for application in cyber–physical systems. Lao et al. [57] introduced a novel location-based and scalable consensus protocol tailored for IoT-blockchain applications. They leveraged fixed IoT devices as endorsers, reducing transaction validation overhead. Wu et al. [58] analyzed the viability of employing blockchain technology to safeguard data exchange in constrained cyber–physical systems, enhancing system efficiency by eliminating the commit phase in PBFT consensus. Xu et al. [59] proposed a secure and efficient blockchain-based model for intelligent Internet of Vehicles, incorporating a score grouping mechanism for selecting trusted nodes in consensus voting and achieving higher consensus efficiency. Zhong et al. [60] presented a shard transaction PBFT consensus mechanism for intellectual property transactions. They employed a grouping strategy based on IP transaction types and developed an evaluation model to assess node reputation based on their behavior. Lastly, Hao et al. [61] proposed BitFT, which combines the strengths of both lottery-based and voting-based mechanisms. This combination aims to address the challenge of balancing high energy consumption and low efficiency in consensus algorithms.

However, these previous works still suffer from high latency during the view-change process because they designate a single leader in each view. While some existing works introduce multiple leaders in a view, such as RCC [62], Mir-BFT [63], Mencius [64], and ISS [65], they do not effectively reduce latency during the view-change process. Specifically, in these multiple-leader works, the additional leaders are assigned to propose different requests, primarily focusing on increasing system throughput. In contrast, Doppel designates its leaders to propose the same requests, resulting in reduced latency as long as at least one leader is non-faulty. This approach distinguishes our protocol from previous multiple-leader designs and contributes to improved performance.

7. Discussion & conclusion

Blockchain-enabled cyber–physical systems have found applications in various domains, and the efficiency of the consensus mechanism is crucial for their practicality and scalability. Existing partially-synchronous BFT protocols, such as PBFT and SBFT, suffer from high latency when the designated leader is faulty. To address this, we propose Doppel, a two-leader BFT protocol built upon PBFT. Doppel achieves similar latency to PBFT-like protocols when the first leader is non-faulty, but significantly reduces latency when the first leader is faulty and the second leader is non-faulty. Experimental results validate the feasibility and efficiency of Doppel. It is worth noting that the concept of multiple leaders in Doppel can be extended to achieve low latency as long as one leader is non-faulty. However, increasing the number of leaders in a view introduces complexity to the view-change protocol and increases the time required to complete the view-change process. In a system with 𝑘 leaders, the overall system latency for a request to be committed is 5𝛿 + (𝑘 − 1) · 2𝛿 in the view-change case; with Doppel's two leaders (𝑘 = 2), this is 7𝛿. In contrast, if there is only one leader, the latency is 𝛥 + 6𝛿. The choice of the number of leaders therefore depends on the system timeout threshold setting (𝛥) and the actual network delay (𝛿), and there exists a trade-off between latency and view-change time when determining the number of leaders in a view.

CRediT authorship contribution statement

Rui Hao: Writing – review & editing, Writing – original draft, Visualization, Validation, Methodology, Investigation,
Conceptualization. Xiaohai Dai: Writing – review & editing, Writing – original draft, Methodology, Conceptualization. Xia Xie: Supervision, Resources, Project administration, Funding acquisition, Conceptualization.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (Grant No. 62362023 and No. 62202187).

References

[1] Keliang Zhou, Taigang Liu, Lifeng Zhou, Industry 4.0: Towards future industrial opportunities and challenges, in: 2015 12th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD, IEEE, 2015, pp. 2147–2152.
[2] Atin Angrish, Benjamin Craver, Mahmud Hasan, Binil Starly, A case study for blockchain in manufacturing: "FabRec": A prototype for peer-to-peer network of manufacturing nodes, Procedia Manuf. 26 (2018) 1180–1192, 46th SME North American Manufacturing Research Conference, NAMRC 46, Texas, USA.
[3] Zhi Li, Ali Vatankhah Barenji, George Q. Huang, Toward a blockchain cloud manufacturing system as a peer to peer distributed network platform, Robot. Comput.-Integr. Manuf. 54 (2018) 133–144.
[4] Xiaokang Zhou, Yiyong Hu, Jiayi Wu, Wei Liang, Jianhua Ma, Qun Jin, Distribution bias aware collaborative generative adversarial network for imbalanced deep learning in industrial IoT, IEEE Trans. Ind. Inform. 19 (1) (2022) 570–580.
[5] Xiaokang Zhou, Qiuyue Yang, Xuzhe Zheng, Wei Liang, Kevin I-Kai Wang, Jianhua Ma, Yi Pan, Qun Jin, Personalized federation learning with model-contrastive learning for multi-modal user modeling in human-centric metaverse, IEEE J. Sel. Areas Commun. (2024).
[6] Ariel Ekblaw, Asaph Azaria, John D. Halamka, Andrew Lippman, A case study for blockchain in healthcare: "MedRec" prototype for electronic health records and medical research data, in: Proceedings of IEEE Open & Big Data Conference, vol. 13, 2016, p. 13.
[7] Qi Xia, Emmanuel Boateng Sifah, Kwame Omono Asamoah, Jianbin Gao, Xiaojiang Du, Mohsen Guizani, MeDShare: Trust-less medical data sharing among cloud service providers via blockchain, IEEE Access 5 (2017) 14757–14767.
[8] Alejandro Ranchal Pedrosa, Giovanni Pau, ChargeItUp: On blockchain-based technologies for autonomous vehicles, in: Proceedings of the 1st Workshop on Cryptocurrencies and Blockchains for Distributed Systems, 2018, pp. 87–92.
[9] Madhusudan Singh, Shiho Kim, Trust bit: Reward-based intelligent vehicle commination using blockchain paper, in: 2018 IEEE 4th World Forum on Internet of Things, WF-IoT, IEEE, 2018, pp. 62–67.
[10] Xiaokang Zhou, Xuzhe Zheng, Xuesong Cui, Jiashuai Shi, Wei Liang, Zheng Yan, Laurence T. Yang, Shohei Shimizu, Kevin I-Kai Wang, Digital twin enhanced federated reinforcement learning with lightweight knowledge distillation in mobile networks, IEEE J. Sel. Areas Commun. (2023).
[11] Khamila Nurul Khaqqi, Janusz J. Sikorski, Kunn Hadinoto, Markus Kraft, Incorporating seller/buyer reputation-based system in blockchain-enabled emission trading application, Appl. Energy 209 (2018) 8–19.
[12] Claudia Pop, Tudor Cioara, Marcel Antal, Ionut Anghel, Ioan Salomie, Massimo Bertoncini, Blockchain based decentralized management of demand response programs in smart energy grids, Sensors 18 (1) (2018) 162.
[13] Xiaokang Zhou, Wei Liang, Kevin I-Kai Wang, Laurence T. Yang, Deep correlation mining based on hierarchical hybrid networks for heterogeneous big data recommendations, IEEE Trans. Comput. Soc. Syst. 8 (1) (2020) 171–178.
[14] Satoshi Nakamoto, Bitcoin: A peer-to-peer electronic cash system, Decentralized Bus. Review (2008) 21260.
[15] Xin Wang, Sisi Duan, James Clavin, Haibin Zhang, BFT in blockchains: From protocols to use cases, ACM Comput. Surv. 54 (10s) (2022) 1–37.
[16] Yang Xiao, Ning Zhang, Wenjing Lou, Y. Thomas Hou, A survey of distributed consensus protocols for blockchain networks, IEEE Commun. Surv. Tutor. 22 (2) (2020) 1432–1465.
[17] Lakshmi Siva Sankar, M. Sindhu, M. Sethumadhavan, Survey of consensus protocols on blockchain applications, in: 2017 4th International Conference on Advanced Computing and Communication Systems, ICACCS, IEEE, 2017, pp. 1–5.
[18] Suyash Gupta, Jelle Hellings, Sajjad Rahnama, Mohammad Sadoghi, An in-depth look of BFT consensus in blockchain: Challenges and opportunities, in: Proceedings of the 20th International Middleware Conference Tutorials, 2019, pp. 6–10.
[19] Jean-Paul A. Yaacoub, Ola Salman, Hassan N. Noura, Nesrine Kaaniche, Ali Chehab, Mohamad Malli, Cyber-physical systems security: Limitations, issues and future trends, Microprocess. Microsyst. 77 (2020) 103201.
[20] Xiaokang Zhou, Wei Liang, Kevin I-Kai Wang, Zheng Yan, Laurence T. Yang, Wei Wei, Jianhua Ma, Qun Jin, Decentralized P2P federated learning for privacy-preserving and resilient mobile robotic systems, IEEE Wirel. Commun. 30 (2) (2023) 82–89.
[21] Miguel Castro, Barbara Liskov, Practical Byzantine fault tolerance, in: OSDI, vol. 99, 1999, pp. 173–186.
[22] Guy Golan Gueta, Ittai Abraham, Shelly Grossman, Dahlia Malkhi, Benny Pinkas, Michael Reiter, Dragos-Adrian Seredinschi, Orr Tamir, Alin Tomescu, SBFT: A scalable and decentralized trust infrastructure, in: 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN, IEEE, 2019, pp. 568–580.
[23] Maofan Yin, Dahlia Malkhi, Michael K. Reiter, Guy Golan Gueta, Ittai Abraham, HotStuff: BFT consensus with linearity and responsiveness, in: Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing, 2019, pp. 347–356.
[24] Michael J. Fischer, The consensus problem in unreliable distributed systems (a brief survey), in: International Conference on Fundamentals of Computation Theory, Springer, 1983, pp. 127–140.
[25] Leslie Lamport, Robert Shostak, Marshall Pease, The Byzantine generals problem, in: Concurrency: The Works of Leslie Lamport, 2019, pp. 203–226.
[26] Zibin Zheng, Shaoan Xie, Hong-Ning Dai, Xiangping Chen, Huaimin Wang, Blockchain challenges and opportunities: A survey, Int. J. Web Grid Serv. 14 (4) (2018) 352–375.
[27] Cynthia Dwork, Nancy Lynch, Larry Stockmeyer, Consensus in the presence of partial synchrony, J. ACM 35 (2) (1988) 288–323.
[28] Michael J. Fischer, Nancy A. Lynch, Michael S. Paterson, Impossibility of distributed consensus with one faulty process, J. ACM 32 (2) (1985) 374–382.
[29] Gabriel Bracha, An asynchronous [(n-1)/3]-resilient consensus protocol, in: Proceedings of the Third Annual ACM Symposium on Principles of Distributed Computing, 1984, pp. 154–162.
[30] Christian Cachin, Klaus Kursawe, Victor Shoup, Random oracles in Constantinople: Practical asynchronous Byzantine agreement using cryptography, J. Cryptol. 18 (3) (2005) 219–246.
[31] Xiaokang Zhou, Xuesong Xu, Wei Liang, Zhi Zeng, Shohei Shimizu, Laurence T. Yang, Qun Jin, Intelligent small object detection for digital twin in smart manufacturing with industrial cyber-physical systems, IEEE Trans. Ind. Inform. 18 (2) (2021) 1377–1386.
[32] Xiaokang Zhou, Xiaozhou Ye, Kevin I-Kai Wang, Wei Liang, Nirmal Kumar C. Nair, Shohei Shimizu, Zheng Yan, Qun Jin, Hierarchical federated learning with social context clustering-based participant selection for internet of medical things applications, IEEE Trans. Comput. Soc. Syst. (2023).
[33] Heena Rathore, Abhay Samant, Murtuza Jadliwala, Amr Mohamed, TangleCV: Decentralized technique for secure message sharing in connected vehicles, in: Proceedings of the ACM Workshop on Automotive Cybersecurity, 2019, pp. 45–48.
[34] Michael K. Reiter, Secure agreement protocols: Reliable and atomic group multicast in Rampart, in: Proceedings of the 2nd ACM Conference on Computer and Communications Security, 1994, pp. 68–80.
[35] Kim Potter Kihlstrom, Louise E. Moser, P. Michael Melliar-Smith, The SecureRing protocols for securing group communication, in: Proceedings of the Thirty-First Hawaii International Conference on System Sciences, vol. 3, IEEE, 1998, pp. 317–326.
[36] T.H. Hubert Chan, Rafael Pass, Elaine Shi, Pili: An extremely simple synchronous blockchain, 2018, Cryptology ePrint Archive.
[37] Ittai Abraham, Dahlia Malkhi, Kartik Nayak, Ling Ren, Maofan Yin, Sync HotStuff: Simple and practical synchronous state machine replication, in: 2020 IEEE Symposium on Security and Privacy, SP, IEEE, 2020, pp. 106–118.
[38] Gavin Wood, et al., Ethereum: A secure decentralised generalised transaction ledger, Ethereum Project Yellow Pap. 151 (2014) 1–32.
[39] Christian Cachin, Jonathan A. Poritz, Secure intrusion-tolerant replication on the internet, in: Proceedings International Conference on Dependable Systems and Networks, IEEE, 2002, pp. 167–176.
[40] Christian Cachin, Klaus Kursawe, Frank Petzold, Victor Shoup, Secure and efficient asynchronous broadcast protocols, in: Annual International Cryptology Conference, Springer, 2001, pp. 524–541.
[41] Andrew Miller, Yu Xia, Kyle Croman, Elaine Shi, Dawn Song, The honey badger of BFT protocols, in: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, 2016, pp. 31–42.
[42] Sisi Duan, Michael K. Reiter, Haibin Zhang, BEAT: Asynchronous BFT made practical, in: Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, 2018, pp. 2028–2041.
[43] Adam Gągol, Damian Leśniak, Damian Straszak, Michał Świętek, Aleph: Efficient atomic broadcast in asynchronous networks with byzantine nodes, in: Proceedings of the 1st ACM Conference on Advances in Financial Technologies, 2019, pp. 214–228.
[44] Idit Keidar, Eleftherios Kokoris-Kogias, Oded Naor, Alexander Spiegelman, All you need is DAG, in: Proceedings of the 2021 ACM Symposium on Principles of Distributed Computing, 2021, pp. 165–175.


[45] George Danezis, Lefteris Kokoris-Kogias, Alberto Sonnino, Alexander Spiegelman, Narwhal and Tusk: A DAG-based mempool and efficient BFT consensus, in: Proceedings of the Seventeenth European Conference on Computer Systems, 2022, pp. 34–50.
[46] Alexander Spiegelman, Neil Giridharan, Alberto Sonnino, Lefteris Kokoris-Kogias, Bullshark: DAG BFT protocols made practical, in: Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, 2022, pp. 2705–2718.
[47] Gabriel Bracha, Asynchronous Byzantine agreement protocols, Inform. and Comput. 75 (2) (1987) 130–143.
[48] Yuan Lu, Zhenliang Lu, Qiang Tang, Guiling Wang, Dumbo-MVBA: Optimal multi-valued validated asynchronous Byzantine agreement, revisited, in: Proceedings of the 39th Symposium on Principles of Distributed Computing, 2020, pp. 129–138.
[49] Diego Ongaro, John Ousterhout, In search of an understandable consensus algorithm, in: 2014 USENIX Annual Technical Conference, USENIX ATC 14, 2014, pp. 305–319.
[50] Elli Androulaki, Artem Barger, Vita Bortnikov, Christian Cachin, Konstantinos Christidis, Angelo De Caro, David Enyeart, Christopher Ferris, Gennady Laventman, Yacov Manevich, et al., Hyperledger Fabric: A distributed operating system for permissioned blockchains, in: Proceedings of the Thirteenth EuroSys Conference, 2018, pp. 1–15.
[51] Loi Luu, Viswesh Narayanan, Chaodong Zheng, Kunal Baweja, Seth Gilbert, Prateek Saxena, A secure sharding protocol for open blockchains, in: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, 2016, pp. 17–30.
[52] Jian Liu, Wenting Li, Ghassan O. Karame, N. Asokan, Scalable Byzantine consensus via hardware-assisted secret sharing, IEEE Trans. Comput. 68 (1) (2018) 139–151.
[53] Johannes Behl, Tobias Distler, Rüdiger Kapitza, Hybster - A highly parallelizable protocol for hybrid fault-tolerant service replication, 2017.
[54] Ethan Buchman, Tendermint: Byzantine Fault Tolerance in the Age of Blockchains (Ph.D. thesis), University of Guelph, 2016.
[55] T.H. Hubert Chan, Rafael Pass, Elaine Shi, Pala: A simple partially synchronous blockchain, 2018, Cryptology ePrint Archive.
[56] Benjamin Y. Chan, Elaine Shi, Streamlet: Textbook streamlined blockchains, in: Proceedings of the 2nd ACM Conference on Advances in Financial Technologies, 2020, pp. 1–11.
[57] Laphou Lao, Xiaohai Dai, Bin Xiao, Songtao Guo, G-PBFT: A location-based and scalable consensus protocol for IOT-blockchain applications, in: 2020 IEEE International Parallel and Distributed Processing Symposium, IPDPS, IEEE, 2020, pp. 664–673.
[58] Yun Wu, Liangshun Wu, Hengjin Cai, Reinforced practical Byzantine fault tolerance consensus protocol for cyber physical systems, Comput. Commun. 203 (2023) 238–247.
[59] Guangquan Xu, Hongpeng Bai, Jun Xing, Tao Luo, Neal N. Xiong, Xiaochun Cheng, Shaoying Liu, Xi Zheng, SG-PBFT: A secure and highly efficient distributed blockchain PBFT consensus algorithm for intelligent internet of vehicles, J. Parallel Distrib. Comput. 164 (2022) 1–11.
[60] Wang Zhong, Wenlong Feng, Mengxing Huang, Siling Feng, ST-PBFT: An optimized PBFT consensus algorithm for intellectual property transaction scenarios, Electronics 12 (2) (2023) 325.
[61] Rui Hao, Xiaohai Dai, Weiqi Dai, BitFT: An understandable, performant and resource-efficient blockchain consensus, IEEE Trans. Sustain. Comput. (2023) 1–12.
[62] Suyash Gupta, Jelle Hellings, Mohammad Sadoghi, RCC: Resilient concurrent consensus for high-throughput secure transaction processing, in: 2021 IEEE 37th International Conference on Data Engineering, ICDE, IEEE, 2021, pp. 1392–1403.
[63] Chrysoula Stathakopoulou, Tudor David, Marko Vukolić, Mir-BFT: High-throughput BFT for blockchains, 2019, arXiv preprint arXiv:1906.05552.
[64] Yanhua Mao, Flavio P. Junqueira, Keith Marzullo, Mencius: Building efficient replicated state machines for WANs, in: 8th USENIX Symposium on Operating Systems Design and Implementation, OSDI 08, 2008.
[65] Chrysoula Stathakopoulou, Matej Pavlovic, Marko Vukolić, State machine replication scalability made simple, in: Proceedings of the Seventeenth European Conference on Computer Systems, 2022, pp. 17–33.

Rui Hao received the Ph.D. degree from Nanjing University (NJU), Nanjing, China, in 2023. She is currently a postdoctoral researcher with the School of Computer Science and Artificial Intelligence, Wuhan University of Technology, China. Her research interests include software quality, software security, and blockchain.

Xiaohai Dai (Member, IEEE) received the Ph.D. degree from the School of Computer Science and Technology, Huazhong University of Science and Technology (HUST), Wuhan, China, in 2021. He is currently a postdoctoral researcher with the School of Computer Science and Technology, HUST. His current research interests include blockchain and distributed systems. His awards include the Outstanding Creative Award in the 2018 FISCO BCOS Blockchain Application Contest and Top Ten in FinTechathon 2019.

Xia Xie is a professor at Hainan University in China. She received her Ph.D. in computer architecture from Huazhong University of Science and Technology in 2006. Her research interests mainly include data mining and knowledge graphs.
