Searchain Blockchain-Based Private Keyword Search in Decentralized Storage
Peng Jiang^{a,∗}, Fuchun Guo^{b,∗}, Kaitai Liang^{c}, Jianchang Lai^{b}, Qiaoyan Wen^{a}
a State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and
Telecommunications, Beijing, China.
b Institute of Cybersecurity and Cryptology, School of Computing and Information Technology, University of
Wollongong, Australia.
c Department of Computer Science, University of Surrey, UK
Abstract
Blockchain-based distributed storage enables users to share data without the help of
a centralized service provider. Decentralization eliminates traditional data loss brought
by compromising the provider, but incurs the possible privacy leakage in a way that
the supplier directly links the retrieved data to its ciphertext. Oblivious keyword search
(OKS) has been regarded as a solution to this issue. OKS allows a user to retrieve
the data associated with a chosen keyword in an oblivious way. That is, the chosen
keyword and the corresponding ciphertext are unknown to the data supplier. But if the
retrieval privilege is with an authorized keyword set, OKS is unavailable due to one-
keyword restriction and public key encryption with keyword search (PEKS) might lead
to high bandwidth consumption.
In this paper, we introduce Searchain, a blockchain-based keyword search system.
It enables oblivious search over an authorized keyword set in the decentralized stor-
age. Searchain is built on top of a novel primitive called oblivious keyword search
with authorization (OKSA), which provides the guarantee of keyword authorization
besides oblivious search. We instantiate a provably secure OKSA scheme, featured
with one-round interaction and constant size communication cost in the transfer phase.
We apply OKSA and ordered multisignatures (OMS) to present a Searchain protocol,
which achieves oblivious peer-to-peer retrieval with order-preserving transaction. The
analysis and evaluation show that Searchain maintains reasonable cost without loss of
retrieval privacy, and hence guarantees its practicality.
Keywords: Decentralized Storage, Oblivious Keyword Search, Authorization,
Blockchain
∗ Corresponding author
Email addresses: [email protected] (Peng Jiang), [email protected]
(Fuchun Guo)
1. Introduction
Data storage with encryption is essential for data suppliers to protect their sensitive
data from being compromised by network attackers. However, traditional storage
systems (e.g., Google, Dropbox and OneDrive) need an individual service provider
to transfer and store the encrypted data. Although they consider integrity protection or
deduplication [1, 2, 3, 4], these storage systems may suffer from potential security
threats (e.g., malware or man-in-the-middle attacks) due to the lack of end-to-end
encryption. Bitcoin [5] has triggered a new trend of decentralized computing. It brings a great
advantage: decentralized control, i.e., no one owns or controls the network. Single
monolithic blockchain technology [6] provides an elegant method to achieve decentralized
storage. A blockchain is a list of blocks, each covering the encrypted data with
verifiability in the current transaction and referring back to previous blocks.
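The "list of blocks, each referring back to previous blocks" can be pictured with a minimal sketch. The field names (`prev`, `payload`, `hash`) and the use of SHA-256 are assumptions made for illustration only; they are not the block format used by Searchain.

```python
import hashlib
import json


def block_hash(prev_hash: str, payload: str) -> str:
    """Hash the block body; the body includes the previous block's hash."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode()).hexdigest()


def append_block(chain: list, payload: str) -> None:
    """Append a block whose `prev` field refers back to the last block."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    chain.append({"prev": prev_hash, "payload": payload,
                  "hash": block_hash(prev_hash, payload)})


def chain_is_valid(chain: list) -> bool:
    """Recompute every hash and check every back-reference in order."""
    prev_hash = "0" * 64
    for block in chain:
        if block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["prev"], block["payload"]):
            return False
        prev_hash = block["hash"]
    return True


# Hypothetical payloads standing in for encrypted data.
chain = []
for ciphertext_hex in ["a1b2", "c3d4", "e5f6"]:
    append_block(chain, ciphertext_hex)
```

Because each hash covers the previous hash, altering the payload of an earlier block makes `chain_is_valid` fail, which is the tamper-resistance and ordering property referred to as immutability.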
Modern storage systems employ blockchain technology and public key encryption
to flexibly share encrypted data, just via a federation of nodes with voting permissions,
that is, a peer-to-peer network [7]. Figure 1 depicts the basic framework of blockchain-
based storage, where a node speaks to the rest of the nodes without a central party
and all blocks recording transactions are linked into a chain in order.
Such a peer-to-peer storage system removes the reliance on the service provider and
addresses the security shortcomings arising from network attacks. It has three features, namely
decentralized control, immutability (written data is tamper-resistant and the blocks are
ordered) and the independent ability to create and transfer assets.
From a practical point of view, data search is necessary, but traditional encryption
limits search capabilities. To bridge encryption and search, searchable encryption
[8] is designed to allow search over encrypted data without revealing the keyword.
Searchable encryption can be realized in either the symmetric setting or the asymmetric
setting. Although symmetric searchable encryption (SSE) enjoys better efficiency
[9, 10], it suffers from complicated secret key distribution/management in the data
sharing stage. To address this issue, Boneh et al. introduced a more flexible primitive,
namely public key encryption with keyword search (PEKS) [11] that enables users to
search data in the asymmetric encryption setting. Although supporting keyword search
in the asymmetric encryption setting, PEKS is incompatible with decentralized data
storage, which requires peer-to-peer communication between nodes without the aid of
the server.
Motivation. In the decentralized storage, each node represents a supplier (who speaks
and shares its data with others) or a user (who listens and wants to retrieve data). For
users, privacy is on the top of the priority list [12], since the data retrieval leakage will
reveal their private interests. Users often prefer the retrieval to link no ciphertext, i.e.,
reveal no additional information. To tackle the issue, Ogata and Kurosawa [13] intro-
duced an interesting notion, called oblivious keyword search (OKS). With a two-party
oblivious transfer protocol, OKS allows the user to retrieve the plain data containing
the keyword of his choice while the supplier learns nothing about the chosen search
keyword and the retrieved plain-cipher data. However, in some special databases, such
as databases for commercial secrets and databases for DNA information, data is highly
confidential. In such a scenario, users have different retrieval rights. That is, the
choice of a keyword must be within an authorized keyword set previously specified
by the supplier and the user. To be precise, if W is the authorized keyword set for a
user, the user can search data associated with any keyword w ∈ W and, meanwhile, the
retrieval rights for all the encrypted data associated with W are granted
by the supplier. The supplier is able to verify that the chosen keyword belongs to W
without learning which keyword it is.
This paper presents a blockchain-based keyword search called Searchain that aims
to enable oblivious search over conditional keyword privacy in the decentralized stor-
age. Searchain is not a trivial combination of blockchain technique and data retrieval
technology. It builds on top of a new notion named oblivious keyword search with
authorization (OKSA). The core idea of OKSA is to generate the trapdoor from the
authorized keyword set, the received token and the secret key of the supplier, so
that the supplier learns only that the search keyword belongs to the authorized keyword
set but cannot distinguish which one it is. OKSA allows Searchain to support
peer-to-peer keyword search and to preserve the retrieval privacy with order by further
combining the ordered multisignatures (OMS).
Naive Solution. OKS supports only one keyword and cannot be directly extended to
the context where a keyword set is needed. A potential approach is that the supplier
encrypts all trapdoors of keywords in W (we assume |W | = n) and runs a 1-out-of-n
oblivious transfer protocol with the user on each encrypted trapdoor. This, however,
comes at the price of a communication cost between the supplier and each user that is
linear in n. This solution does not scale well due to its high bandwidth overhead
(i.e., O(n)).
Our Contributions. To summarize, our contributions are fourfold.
• We propose the notion of OKSA, which augments OKS with the idea of embed-
ding an authorized keyword set. It provides authorization and verification for
the search keyword in an oblivious way. We propose a provably secure OKSA
instantiation with constant-size transfer communication.
• We present the design of Searchain by using OKSA. The protocol supports en-
crypted retrieval over an authorized keyword set, while hiding the search key-
word and data. The block, generated by OMS to record the retrieval transaction,
works seamlessly with keyword search and hence reliably preserves the transac-
tion order.
• We evaluate our proposed Searchain protocol and the results show its scalability
and practicality in the decentralized storage.
Differences Between This Work and The Conference Version [14]. In this version,
we start from the decentralized storage, and build a blockchain-based data retrieval
framework with its objectives and threat model. Under the new framework, we present
a Searchain protocol by employing OKSA, which was proposed in [14], and OMS. In
addition, we evaluate the performance of the Searchain protocol by functionality analy-
sis and implementation. So the conference version [14] is just one of four contributions
in this work.
Organization. The rest of this paper is organized as follows. In Section 2, we re-
view some related work. Section 3 and Section 4 describe the Searchain overview and
OKSA, respectively. We present a Searchain protocol from OKSA in Section 5 and
evaluate it in Section 6. The formal security proof of OKSA is given in Section 7.
Finally, we conclude the paper in Section 8.
2. Related Work
given. A healthcare chain [21] was constructed to facilitate data interoperability in
health information networks. However, these systems focus on the concepts with the
corresponding frameworks instead of the concrete algorithms to guarantee data utilization
and data secrecy. Also, linkable transaction privacy remains a gap in
blockchain-based retrieval.
Public Key Encryption with Keyword Search. Boneh et al. introduced the notion of
public key encryption with keyword search (PEKS) [11] to address the issue of the
complicated key management in SSE and achieve search over the encrypted data in
the asymmetric setting. Afterwards, combinable multi-keyword search schemes have
been proposed to provide diverse search functionality, such as public-key encryption
scheme with conjunctive keyword search (PECKS) [22, 23, 24] and public key encryp-
tion with temporary keyword search (PETKS) [25]. To improve the keyword privacy,
secure channel free-PEKS (SCF-PEKS) schemes [26] and ciphertext retrieval against
insider attacks (CR-IA) [27] were proposed to resist outsider attacks and insider at-
tacks, respectively, and public key encryption with oblivious keyword search (PEOKS)
[28] was proposed to permit authorized private search. We find that PEKS is unsuitable
in the two-party database operation.
Oblivious Transfer. The notion of oblivious transfer (OT) was originally introduced by Rabin
[29] as a two-party protocol between a sender S and a receiver R. S has two
bits and R wishes to get one of them such that the following properties hold: S does not
know which bit R obtains, and R does not learn any information about the bit that he
did not obtain. In an OT system, the most general type is k-out-of-n oblivious transfer
($OT^k_n$), where S holds n messages and R retrieves k of them simultaneously, such
that S does not know which messages R obtains. There have been many works on
oblivious transfer, such as adaptive oblivious transfer [30, 31], oblivious transfer with
fully simulatable security [32], oblivious transfer with universally composable security
[33], oblivious transfer with access control [34] and priced oblivious transfer [35, 36].
Some proposed $OT^k_n$ protocols, such as [37, 38], have ideal communication rounds.
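To make the two OT properties concrete, the following is a minimal sketch of a 1-out-of-2 oblivious transfer. It follows the Diffie-Hellman style of Chou and Orlandi's "simplest OT" rather than any construction cited above, and it is only an illustration: the modulus, the base `G`, the hash-based key derivation and the XOR pad are assumptions chosen for brevity, with no claim of security against malicious parties.

```python
import hashlib
import secrets

P = 2**127 - 1   # a Mersenne prime used as a toy modulus (assumption)
G = 3            # assumed base element


def kdf(elem: int, length: int) -> bytes:
    """Derive a pad of `length` bytes (at most 32) from a group element."""
    return hashlib.sha256(str(elem).encode()).digest()[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def sender_round1():
    """S picks a and publishes A = G^a."""
    a = secrets.randbelow(P - 2) + 1
    return a, pow(G, a, P)


def receiver_round(A: int, choice: int):
    """R hides its choice bit in B and derives the key for that choice."""
    b = secrets.randbelow(P - 2) + 1
    B = pow(G, b, P)
    if choice == 1:
        B = B * A % P
    return B, kdf(pow(A, b, P), 32)


def sender_round2(a: int, A: int, B: int, m0: bytes, m1: bytes):
    """S encrypts m0 under the key for choice 0 and m1 under the key for choice 1."""
    k0 = kdf(pow(B, a, P), 32)
    k1 = kdf(pow(B * pow(A, -1, P) % P, a, P), 32)
    return xor(m0, k0), xor(m1, k1)


def receiver_output(key: bytes, e0: bytes, e1: bytes, choice: int) -> bytes:
    """R can only remove the pad of the message it chose."""
    return xor((e0, e1)[choice], key)
```

For either choice bit, B is a uniformly distributed power of G, so S cannot tell which message R obtains; R holds only one of the two keys, so the other message stays hidden behind its pad.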
Oblivious Keyword Search. Ogata and Kurosawa [13] introduced the notion of oblivi-
ous keyword search to address the user privacy issue in the keyword search, which was
based on a two-party OT protocol between a supplier and a user. Their OKS employed
the blind signature, where the ciphertext is generated with the master secret key of the
supplier (denoted by msk) and some keyword, and each trapdoor is transferred from
the supplier to the user using msk and the keyword token generated by the user. Rhee
et al. [39] presented an oblivious conjunctive keyword search to allow search over
boolean combinations of keywords. Freedman et al. [40] considered privacy concerns
in keyword search using oblivious evaluation of pseudorandom functions. Zhu and Bao
[41] addressed the OKS in the public database by using linear and non-linear oblivious
polynomial evaluation. Camenisch et al. [28] proposed the public key encryption with
oblivious keyword search (PEOKS) to build a public key encrypted database permitting
private information retrieval (PIR), where computationally expensive zero-knowledge
proof (ZKP) was employed.
Ordered Multisignatures. Ordered Multisignatures (OMS) was proposed in [42] to al-
low signers to attest to a common message as well as the order in which they signed. In
OMS, a group of signers sequentially form an aggregate by each adding their own sig-
nature to the aggregate-so-far. [43] proposed sequentially aggregate signed data based
on uncertified claw-free permutations in the random oracle model, to minimize the to-
tal amount of transmitted data rather than just the signature length. [44] presented a
practical synchronized aggregate signature scheme without the interactive complexity
assumption used in [42]. [45] constructed a provably secure OMS scheme in the standard
model, which also improved the efficiency compared with the original OMS [42].
3. Searchain Overview
3.1. Architecture
Figure 2 depicts an overview of the Searchain architecture. It includes transaction
nodes with a peer-to-peer structure and a blockchain comprising all the ordered blocks. Their
functions and the Searchain workflow are described as follows.
• Node. Nodes can share data in a peer-to-peer mode. A node plays the role of
the supplier, the user or the verifier. As a user, the node generates the data retrieval
request, and accordingly creates a block and broadcasts it. As a supplier, the
node that owns the data can respond to the request to help retrieve the data. As a
verifier, the node collects the unconfirmed block and decides whether to approve it. No third
party controls the data retrieval process.
• Block and Blockchain. A block provides a record of the current data retrieval
information (which we call a transaction record). The block can be added to
the chain after being approved by most nodes in the network. We note that
optimizing the number of nodes involved in block approval is out of the
scope of this work.
Workflow. We assume that the transaction happens between two nodes, i.e., Node A
and Node B, where Node A acts as a data supplier and Node B acts as a user. The goal
of Node B is to retrieve the data owned by Node A. Searchain mainly consists of the
following five phases.
Initialization Transaction. Node A generates some parameters for this transaction,
where the public parameters are published while the secret key is kept only by Node A.
Node A negotiates a keyword set with Node B. Node A can control whether Node B
is able to retrieve his data.
Data Sharing. Node A leverages the encryption module to handle the sensitive
plaintext data before sharing with other nodes. Each plaintext data is associated with
its respective keyword. The ciphertext data should provide both search capability and
confidentiality, that is, a valid node can search and access the plain data. The ciphertext
data is transmitted in the public network and available to any other nodes.
Retrieval Request. Node B submits a request to Node A for its target data.
To hide the target of Node B, this request should be in an encrypted form,
e.g., the encryption of some keyword. Meanwhile, a block that records the data
transaction is generated. After that, the request and the block are broadcast to the other
nodes in the network.
Verification. The block needs to be approved for validity before it is added to the
chain with an unchanged order. Node A verifies the validity of the request, that is, the
keyword in this request is in the negotiated keyword set, and thereafter distributes a key
to Node B. During the request verification, decryption key generation and distribution,
Node A learns nothing about the retrieved data.
Data Retrieval. Upon receiving a valid key, Node B can search and access the
data of interest using its other secret information.
3.2. Objectives
Searchain works with each block being verified publicly. Considering the structure
features of the blockchain-based storage system and the properties of keyword search
over encrypted data, Searchain aims to satisfy the objectives, i.e., Decentralizing, Rule
Independence, Transaction Order-preserving, Secrecy and Retrieval Privacy.
• Retrieval Privacy. When Node B wants to retrieve data from Node
A, Searchain protects Node B's retrieval privacy. That is, Node A learns
neither the retrieved plain-cipher data nor its associated keyword, although
it is assured that this keyword is authorized.
[Figure 3: The OKSA framework between the supplier and the user, each holding the authorized keyword set W.]
In this section, we present the building blocks of OKSA, which follows the supplier-user model in
OKS [13]. We systematically study the keyword authorization problem in the oblivious
keyword search, where the supplier has an agreed authorized keyword set with each
user. In OKSA, the user generates a keyword token for any keyword in the authorized
keyword set and thereafter the supplier generates the trapdoor with the received token,
his secret key and the authorized keyword set. Figure 3 presents the OKSA framework
and its detailed algorithms are described as follows.
4.1. Algorithm Definition
Definition 1. An oblivious keyword search with authorization scheme consists of the
following polynomial time randomized algorithms.
Setup. The supplier T takes a security parameter λ and an integer n as input, and
outputs the public parameter pp and the master public/secret key pair (mpk, msk)
to establish the system. Note we assume pp is implicitly included in the following
algorithm. T negotiates a keyword set W with each user, where |W | ≤ n.
Commit. T takes a message mi , a keyword wi and the master public key mpk as input,
and outputs the ciphertext CTi , where each message mi has its own unique keyword
wi . T commits all ciphertexts {CTi } to the user U.
Transfer.
Transfer 1. U → T: U takes the authorized keyword set W, a specified keyword
w_{i'} ∈ W and the master public key mpk as input, and outputs the keyword
token P(w_{i'}), the secret key of the user sk and the proof information for
accountability Σ. Then U sends (P(w_{i'}), Σ) to T. Here, P(w_{i'}) is
computed from sk, w_{i'}, W and mpk. Σ helps T to verify the accountability, that
is, that the received token is used to generate a trapdoor for only one keyword
in the authorized keyword set.
Transfer 2. T: T takes the received keyword token P(w_{i'}), the authorized keyword set
W and the master secret key msk as input. It verifies the accountability
by checking |P(w_{i'})| = 1.
Transfer 3. T → U: Once the verification passes, T outputs a trapdoor T to U.
Transfer 4. U: U takes CT_i, T, sk as input and outputs m_i if w_i = w_{i'}; otherwise, it outputs ⊥.
Correctness. An oblivious keyword search with authorization is correct if the user
obtains the message of his choice when all entities follow the protocol steps above.
Also, passing the verification of accountability means that the trapdoor generated from
the received token will be for only one specific keyword and this specific keyword is in
the authorized keyword set.
• (Accountability.) Given (P(W ), W, sk) satisfying |W | > 1, it is hard to gener-
ate (P(W ), Σ) that passes the verification.
Based on the above requirements, we define the security models via the following
games played between a challenger C and an adversary A. More formally,
User Privacy Game.
Setup. C runs the Setup algorithm to generate mpk and sends it to A.
Challenge. A gives two keywords w_0, w_1 to C. C responds by choosing a coin θ ∈
{0, 1}, setting w = w_θ and generating (P(w), Σ).
Guess. A outputs θ′ and wins the game if θ′ = θ.
We define A's advantage as Adv = |Pr[θ′ = θ] − 1/2|.
Definition 2. We say that an OKSA scheme satisfies user privacy if there exists no
probabilistic polynomial time adversary to win the above user privacy game with a
non-negligible advantage.
Indistinguishability Game.
Setup. C runs the Setup algorithm to generate mpk and sends it to A.
Phase 1. A makes a trapdoor query for w and C responds with the trapdoor T.
Challenge. A gives two message-keyword tuples of the same length, (m_0, w_0) and (m_1, w_1), to
C with the restriction that no trapdoor query has been issued for w_0 or w_1 in Phase 1.
C randomly chooses θ ∈ {0, 1} and responds with the challenge ciphertext CT∗ for (m_θ, w_θ).
Phase 2. A issues more trapdoor queries under the same restriction as in Challenge, and C
responds as in Phase 1.
Guess. A outputs θ′ and wins the game if θ′ = θ.
We define A's advantage as Adv = |Pr[θ′ = θ] − 1/2|.
Definition 3. We say that OKSA has indistinguishability against chosen keyword at-
tack if there exists no probabilistic polynomial time adversary to win the above game
with a non-negligible advantage.
Accountability Game.
In OKSA, the verification of accountability assures that the trapdoor is for only
one authorized keyword. It captures the attack in which an adversary A forges a proof
for a valid keyword token P(W′), where W′ is a subset of the authorized keyword set
W with 1 < |W′| < |W| ≤ n. Here, validity means that A knows the W′, W and sk used
to compute P(W′).
Setup. C runs the Setup algorithm to generate mpk and sends it to A.
Challenge. A outputs (P(W′), W, W′, sk) and 1 for challenge, where P(W′) is
generated from W, W′, sk, mpk and |W′| > 1.
Win. A outputs (P(W′), Σ) and wins the game if (P(W′), Σ) passes the verification
algorithm.
We define A's advantage as Adv in computing (P(W′), Σ).
Definition 4. We say that OKSA has accountability if there exists no polynomial time
adversary to win the above game with a non-negligible advantage.
4.3. Construction
In this section, we propose an oblivious keyword search with authorization protocol.
Our protocol allows the user to obliviously obtain an authorized trapdoor by
submitting a keyword token adaptively. It features a constant-size communication
cost between the supplier and the user. In the proposed scheme, T can
generate the trapdoor for any keyword in the authorized keyword set but cannot guess
which one it is. Like OKS, OKSA is played between a supplier T and a user U, and it
consists of three phases: Setup, Commit and Transfer, as follows.
Setup. T takes as input a security parameter λ and an integer n. It chooses a bilinear
map system PG = (p, G, G_T, e) [46] and a cryptographic hash function
$H : (\{0, 1\}, G_T) \to \{0, 1\}^{\ell}$. It also randomly selects $g, h \in G$ and
$\alpha, x \in \mathbb{Z}_p$, and computes $g^{\alpha}$ and $h_i = h^{\alpha^i}$ for
$i = 1, 2, \cdots, n$. The public parameter is denoted as pp = (PG, H, g, h), and the
master public/secret key pair is
$$mpk = (g^{\alpha}, h_1, h_2, \cdots, h_n), \quad msk = \alpha.$$
If both equations hold, T accepts that the received keyword token yields a
trapdoor for exactly one keyword, which we denote as |P(w_i)| = 1; otherwise,
it aborts.
Transfer 3. T → U: Given msk and W, T computes the trapdoor T as
$$T = P(w_i)^{\frac{1}{\prod_{w_j \in W} (\alpha + w_j)}}.$$
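As a sketch of what this exponentiation produces, the trapdoor can be simplified by substituting the token form $P(w_i) = h^{s \prod_{w_j \in W, j \neq i} (\alpha + w_j)}$, which is the form that appears in the correctness equations and in the proof of Theorem 1; here s denotes the user's secret randomness.

```latex
\begin{aligned}
T &= P(w_i)^{\frac{1}{\prod_{w_j \in W} (\alpha + w_j)}}
   = h^{\frac{s \prod_{w_j \in W,\, j \neq i} (\alpha + w_j)}{\prod_{w_j \in W} (\alpha + w_j)}}
   = h^{\frac{s}{\alpha + w_i}} .
\end{aligned}
```

Under this substitution T is a single group element whatever the size of W, which is consistent with the constant-size transfer stated for the scheme.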
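Correctness. Given the master public/secret key pair (mpk, msk) from running the Setup
algorithm and the token/proof tuple (P(w_i), Σ), the correctness of the accountability is
verified by the following equations.
$$e\left(\Sigma_2, h^{\alpha}\right) = e\left(\Sigma_1^{\alpha^{n-1}}, h^{\alpha}\right) = e\left(\Sigma_1, h^{\alpha^{n}}\right),$$
$$e\left(h, h^{\prod_{w_j \in W} (\alpha + w_j)}\right) = e\left(h^{s \prod_{w_j \in W, j \neq i} (\alpha + w_j)}, h^{\frac{\alpha + w_i}{s}}\right) = e\left(P(w_i), \Sigma_1\right).$$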
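Given a ciphertext CT_i from running the Commit algorithm, a trapdoor T from
running the Transfer algorithm and the secret key of the user sk, the correctness of
searching and decryption can be verified by
$$H\left(0, e(c_{1i}, T)^{\frac{1}{s}}\right) = H\left(0, e\left(g^{r_i(\alpha + w_i)}, h^{\frac{s}{\alpha + w_i}}\right)^{\frac{1}{s}}\right) = H\left(0, e(g, h)^{r_i}\right) = c_{2i},$$
$$c_{3i} \oplus H\left(1, e(c_{1i}, T)^{\frac{1}{s}}\right) = H\left(1, e(g, h)^{r_i}\right) \oplus m_i \oplus H\left(1, e\left(g^{r_i(\alpha + w_i)}, h^{\frac{s}{\alpha + w_i}}\right)^{\frac{1}{s}}\right)$$
$$= H\left(1, e(g, h)^{r_i}\right) \oplus m_i \oplus H\left(1, e(g, h)^{r_i}\right) = m_i.$$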
4.4. Security
OKSA achieves User Privacy, Indistinguishability and Accountability. The formal
security proof will be presented later in Section 7.
5. Searchain Design
This section shows a Searchain protocol from OKSA and OMS. OKSA allows the
user to obliviously retrieve data from the supplier without a third party. We employ
OMS in the block generation, which records the data sharing information of the current
transaction. All blocks can thereafter be added into the chain, where OMS provides
the attestation for the order of the data retrieval transaction. This chain guarantees the
transparency and order of the record. Searchain hides the retrieved plain-cipher data
but verifies the authorization of the keyword in the decentralized storage. The used
notations are summarized in Table 1.
Before presenting the Searchain protocol, we review the OMS scheme [42].
Definition 5. An OMS scheme OMS = (OPg, OKg, OSign, OVf) consists of four
algorithms.
OPg. The parameter generation algorithm returns some global information for the
scheme.
OKg. The key generation algorithm inputs global information and returns a public-
private key-pair (pk, sk).
OSign. The signing algorithm inputs the secret key sk, a message m, an OMS-so-far
σ and a list of i − 1 public keys L = (pk_1, · · · , pk_{i−1}), and returns a new OMS
σ′, or ⊥ if the input is deemed invalid.
OVf. The verification algorithm inputs a list of public keys (pk_1, · · · , pk_n), a message
m and an OMS σ′, and returns a bit.
We intuitively illustrate the Searchain protocol in Figure 4 and describe its details
as follows. In the Searchain protocol, we just consider an independent transaction
between two nodes. We say that the block will be added into the chain once approved
by more than half of nodes in this system. Nodes will adopt the same verification
algorithm for the same block.
Figure 4. The Searchain protocol between Node A (supplier) and Node B (user).

Data Sharing (Node A), on input {m_i, w_i}:
  For each i, compute the cipher data CT_i ← OKSA Commit.
  Node A → Node B: {CT_i}

Retrieval Request (Node B):
  Choose some keyword w_i.
  Compute the request Req ← OKSA Transfer 1.
  Compute the block σ′ ← OMS OSign.
  Broadcast (Req, σ′).
  Node B → Node A: (Req, σ′)

Verification (Node A):
  Run OMS OVf.
  If more than half of the nodes approve σ′, add σ′ to the chain.
  Run OKSA Transfer 2.
  If Req is verified, compute the key Key ← OKSA Transfer 3.
  Node A → Node B: Key

Data Retrieval (Node B):
  Run OKSA Transfer 4.
  Search the target cipher data CT_i.
  Decrypt the message m_i ← OKSA Transfer 4.

We note that other nodes also verify the block σ′. Only when approved by more than half of the nodes is this block regarded
as passing the verification and added into the chain.
5.4. Verification
The verification covers both the validity of the transaction and the authentication of the request. During
this phase, Node A executes no decryption operation to learn the keyword w_i in the
request.
V.1 Any node runs the OMS OVf algorithm to verify the block. Once more than half
of nodes in the network approve this transaction, the block σ 0 is added to the chain.
V.2 Node A runs the OKSA Transfer 2 algorithm to check whether the request is for
a negotiated keyword.
V.3 Once the request is accepted, Node A runs the OKSA Transfer 3 algorithm to generate a
key Key for Node B. This key helps Node B to search and access the data associated
with w_i from Node A.
U.1 Node B runs the OKSA Transfer 4 to search CTi from all the ciphertexts {CTi }.
U.2 Node B runs the OKSA Transfer 4 to access the data mi for wi .
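The approval rule used in step V.1 ("more than half of nodes in the network approve") can be written as a strict majority test. This is a minimal sketch; representing each node's OVf outcome as a boolean in a list is an assumption for illustration.

```python
def block_approved(votes: list) -> bool:
    """Return True if strictly more than half of the nodes approve the block.

    `votes` holds one boolean per node: the outcome of that node's OMS OVf check.
    """
    approvals = sum(1 for ok in votes if ok)
    return 2 * approvals > len(votes)
```

With an even number of nodes a tie does not count as approval, since the rule requires more than half.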
6. Performance Evaluation
[Figure 5: Bandwidth of the cipher data, the request and the key versus the size of the keyword set.]
Transfer Bandwidth. We test the bandwidth of three kinds of parameters, namely the
cipher data, the request and the key, and exclude the response bandwidth from other nodes for
block approval. We select a hash function whose output is 64 bits and plain data
of the same length. We vary the size of the negotiated keyword set and test
the corresponding parameter size. The experimental results are shown in Figure 5.
Each parameter remains almost constant in size as the negotiated keyword set
grows. Therefore, the bandwidth for transfer is
independent of the size of the keyword set.
Computation Overhead. We measure the computation overhead in different phases.
The detailed settings and necessary assumptions depend on the corresponding phases.
• Data Sharing phase. We test the computation time to generate the cipher data.
Encryption for each message is an independent process, so the measurements
run ten times for ten pre-set messages, respectively. Figure 6 presents the data
encryption speed versus the size of the authorized keyword set. We can observe
that the computation time in Data Sharing phase is independent of the size of the
keyword set.
• Retrieval Request phase. We test the computation time to generate the request
and the block (Req, σ′). In our measurements, the size of the keyword
set is i in the i-th retrieval transaction, and the size of the keyword set varies
from 1 to 10. We show the measured results in Figure 7. The request generation
time grows linearly with the size
of the keyword set.
• Verification phase. We measure the verification speed, which includes the block
approval, the request accountability and key generation. In our experiment, we
make an assumption for block verification, that is, the block is valid as long as
6 nodes approve it. Figure 8 shows the verification speed versus the size of the
keyword set. We can see that the computation time of this phase grows linearly
as the keyword set includes more keywords.
• Data Retrieval phase. We test the time to retrieve one message. As in the Searchain
protocol of Section 5, the computation is mainly contributed
by the data search and decryption. The result is presented in Figure 9. When the
size of the keyword set increases, there is no noticeable change in the
time to retrieve the message. This tallies with the Searchain protocol, whose
retrieval algorithm is independent of the keyword set.
7. Security Analysis
7.1. Assumptions
We define two hard problems to provide foundation for the security of OKSA,
i.e., (f, n)-DHE Problem and (f, q)-MSE-DDH Problem. Since (f, n)-DHE Problem
has been proposed and analyzed in [49, 50], we only give its description and omit its
intractability analysis. We refer readers to the corresponding references for details.
[Figures 6-9: Encryption speed, request speed, verification speed and retrieval speed, each plotted against the size of the keyword set.]
Then we introduce a new hard problem named (f, q)-MSE-DDH Problem, which
is slightly modified from MSE-DDH problem while still preserving its hardness. Our
(f, q)-MSE-DDH problem is a special instance of general Diffie-Hellman exponent
assumptions in [46], and its intractability will be analyzed later on.
E = 1,
F = rβq(α).
We need to show that F is independent of (D, E), i.e., no coefficients {x_{i,j}} and y_1
exist such that F = Σ x_{i,j} d_i d_j + Σ y_1 e_1, where the polynomials d_i, d_j are listed in D
and e_1 is listed in E above. By making all possible products of two polynomials from
D which are multiples of rβ to F′, we want to prove that no such linear combination
F′ leads to F,
where A(α), B(α) are polynomials with degree deg A ≤ n − 2 and deg B ≤ n.
If B(α) 6= 0, we have deg f (α)q(α)B(α) ≥ n. Since deg (q(α) − f (α)A(α)) ≤
n − 1, we have B(α) = 0. We simplify the above equation as f (α)A(α) = q(α), so
f (α)|q(α), which contradicts that f (α) and q(α) are comprime. Therefore, there exist
no coefficients {xi,j }, y1 such that F = Σxi,j di dj + Σy1 e1 holds, (f, q)-MSE-DDH
Problem is intractable.
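The contradiction above rests on $f(\alpha)$ and $q(\alpha)$ being coprime. As a minimal sketch, assuming the instantiation used later in the proof of Theorem 2 (a linear $f(x) = x + w_\theta$ and $q(x)$ a product of linear factors over the remaining keywords), coprimality reduces to checking that $q(-w_\theta) \neq 0$. The modulus `P`, the helper names and the toy keyword values are illustrative assumptions, not part of the scheme.

```python
# Toy check, at the level of polynomials over Z_p, that a linear f and a
# product-of-linear-factors q share no root (and hence are coprime).
P = 2**61 - 1  # a Mersenne prime standing in for the group order p (assumption)

def q_eval(x, keywords, p=P):
    """Evaluate q(x) = prod_{w in keywords} (x + w) mod p."""
    acc = 1
    for w in keywords:
        acc = acc * (x + w) % p
    return acc

def is_coprime_with_linear(w_theta, keywords, p=P):
    """f(x) = x + w_theta is linear, so gcd(f, q) = 1 iff q(-w_theta) != 0 mod p."""
    return q_eval(-w_theta % p, keywords, p) != 0

keyword_space = [11, 22, 33, 44, 55]   # hypothetical encoded keywords
w_theta = 33
others = [w for w in keyword_space if w != w_theta]

coprime_ok = is_coprime_with_linear(w_theta, others)          # q excludes w_theta
coprime_bad = is_coprime_with_linear(w_theta, keyword_space)  # q includes w_theta
```

When $q$ excludes the factor $(x + w_\theta)$ the check succeeds; including that factor makes $f$ divide $q$, which is exactly the situation the argument rules out.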
Theorem 1. The proposed scheme satisfies the unconditional keyword privacy of the
token from the user under the User Privacy game.
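PROOF. Let $W$ be the authorized keyword set and $(P(w), \Sigma)$ be generated from $w = w_0$. We have the keyword token and proof as
$$P(w) = h^{s \prod_{w_j \in W, w_j \neq w_0} (\alpha + w_j)}, \quad \Sigma = \left( \Sigma_1 = h^{\frac{\alpha + w_0}{s}},\ \Sigma_2 = h^{\frac{\alpha^{n-1}(\alpha + w_0)}{s}} \right).$$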
For any distinct keyword $w_1$, let $s' \in \mathbb{Z}_p$; we implicitly set $s' = s \cdot \frac{\alpha + w_1}{\alpha + w_0}$. We find that
the keyword tokens are identical, i.e., $P(w_0) = P(w_1)$, which can be verified as
$$P(w_0) = h^{s \prod_{w_j \in W, w_j \neq w_0} (\alpha + w_j)} = h^{s' \prod_{w_j \in W, w_j \neq w_1} (\alpha + w_j)} = P(w_1).$$
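Suppose $\Sigma' = (\Sigma'_1, \Sigma'_2)$. The proofs of accountability are also identical, i.e., $\Sigma_1 = \Sigma'_1$
and $\Sigma_2 = \Sigma'_2$, which can be verified as
$$\Sigma_1 = h^{\frac{\alpha + w_0}{s}} = h^{\frac{\alpha + w_1}{s'}} = \Sigma'_1, \quad \Sigma_2 = h^{\frac{\alpha^{n-1}(\alpha + w_0)}{s}} = h^{\frac{\alpha^{n-1}(\alpha + w_1)}{s'}} = \Sigma'_2.$$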
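The identities in this proof can be checked numerically at the level of exponents. The following is a minimal sketch, assuming a toy prime `P` in place of the group order and working with the exponents of $h$ rather than with group elements (equal exponents imply equal group elements). All concrete values and helper names are hypothetical.

```python
# Exponent-level check that the view (P(w), Sigma_1, Sigma_2) produced for
# (w0, s) is the same as the one produced for (w1, s') with
# s' = s * (alpha + w1) / (alpha + w0) mod p.
P = 2**61 - 1  # toy prime standing in for the group order p (assumption)

def inv(x, p=P):
    """Modular inverse via the three-argument pow (Python 3.8+)."""
    return pow(x, -1, p)

def prod_excluding(alpha, keywords, skip, p=P):
    """prod_{w_j in keywords, w_j != skip} (alpha + w_j) mod p."""
    acc = 1
    for w in keywords:
        if w != skip:
            acc = acc * (alpha + w) % p
    return acc

def token_exponents(alpha, s, keywords, w, n, p=P):
    """Exponents of h in P(w), Sigma_1 and Sigma_2 for keyword w and secret s."""
    tok = s * prod_excluding(alpha, keywords, w, p) % p
    sig1 = (alpha + w) * inv(s, p) % p
    sig2 = pow(alpha, n - 1, p) * sig1 % p
    return tok, sig1, sig2

alpha, s = 123456789, 987654321   # hypothetical master secret and user secret
W = [11, 22, 33, 44]              # hypothetical authorized keyword set
n = 6                             # hypothetical bound on the keyword-set size
w0, w1 = 11, 33

s_prime = s * (alpha + w1) % P * inv(alpha + w0) % P
view_w0 = token_exponents(alpha, s, W, w0, n)
view_w1 = token_exponents(alpha, s_prime, W, w1, n)
```

Here `view_w0` and `view_w1` coincide although `s_prime` differs from `s`, which mirrors the claim that the token and proof do not determine which keyword in $W$ was chosen.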
Theorem 2. The proposed scheme is semantically secure and indistinguishable un-
der the Indistinguishability game in the random oracle model if the (f, q)-MSE-DDH
Problem is hard.
PROOF. Suppose there exists an adversary A who can break the indistinguishability.
We can construct an algorithm B that solves the (f, q)-MSE-DDH Problem. That is,
given an instance of the (f, q)-MSE-DDH Problem and $Z \in G_T$, the goal of B is to decide
whether $Z = e(g_0, h_0)^{r q(\alpha)}$ or $Z$ is a random group element in $G_T$. B interacts with A as
follows.
Setup. We assume the universal keyword space is $KS = \{w_1, w_2, \cdots, w_n\}$. B
chooses $w_\theta$ from $KS$, and its corresponding message is denoted as $m_\theta$. It implicitly
sets the polynomials $f(\alpha) = \alpha + w_\theta$ and $q(\alpha) = \prod_{w_j \in KS, w_j \neq w_\theta} (\alpha + w_j)$. It also sets
$$g = g_0, \quad h = h_0^{f(\alpha) q(\alpha)}$$
and computes $h_i = h_0^{\alpha^i f(\alpha) q(\alpha)}$. The public parameter is
denoted as $pp = (g_0, h_0^{f(\alpha) q(\alpha)}, PG)$. B sends the master public key mpk to A, where
$$mpk = (g_0^{\alpha}, h_1, h_2, \cdots, h_n).$$
H-Query. B maintains a hash list $L$ of tuples $(a_i, X_i, h_i)$, which is initially empty. Upon
receiving an H query for $(a_i, X_i)$, if $(a_i, X_i)$ is in the list $L$, B returns the corresponding
$h_i$ to A. Otherwise, B sets the hash value $h_i$ as follows.
$$h_i = H(a_i, X_i) = \begin{cases} b_0^i, & \text{if } a_i = 0, \\ b_1^i, & \text{if } a_i = 1, \end{cases}$$
where $b_0^i, b_1^i$ are randomly chosen from $\{0, 1\}^{\ell}$. Then B adds $(a_i, X_i, h_i)$ to the list and
returns $h_i$ to A.
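The H-Query step is the usual lazy sampling of a random oracle: a fresh random string is drawn the first time a pair is queried and replayed afterwards. A minimal sketch of that bookkeeping follows, assuming an output length of 128 bits for $\ell$; the class name, the dictionary standing in for the list $L$, and the string used for $X_i$ are illustrative choices only.

```python
import secrets

class LazyRandomOracle:
    """Simulates H(a, X) by sampling outputs on first use and caching them."""

    def __init__(self, out_bits=128):
        self.out_bits = out_bits   # plays the role of the output length l (assumption)
        self.table = {}            # maps (a, X) to its hash value; stands in for the list L

    def query(self, a, X):
        key = (a, X)
        if key not in self.table:
            # First query for (a, X): draw a uniform out_bits-bit value and record it.
            self.table[key] = secrets.randbits(self.out_bits)
        # Repeated queries return the recorded value, keeping H consistent.
        return self.table[key]

oracle = LazyRandomOracle(out_bits=128)
first = oracle.query(0, "X")
second = oracle.query(0, "X")   # same input, so the cached value is returned
```

Keeping the table explicit is what later allows the simulator to insert chosen values for specific inputs before the adversary queries them.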
Phase 1. A chooses a keyword set W ⊆ KS, where |W | ≤ n. When asking for the
trapdoor query for a keyword wi ∈ W , A randomly chooses s ∈ Zp as the secret key
sk = s, and sends (wi , s) to B.
• If wi = wθ , abort.
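• If $w_i \neq w_\theta$, B responds with $T = h_0^{s q_i(\alpha) f(\alpha)}$ to A, where $q_i(\alpha) = \frac{q(\alpha)}{\alpha + w_i}$. The
trapdoor can be verified as
$$T = P(w_i)^{\frac{1}{\prod_{w_j \in W} (\alpha + w_j)}} = h^{\frac{s \prod_{w_j \in W, j \neq i} (\alpha + w_j)}{\prod_{w_j \in W} (\alpha + w_j)}} = h_0^{\frac{s f(\alpha) q(\alpha)}{\alpha + w_i}} = h_0^{s q_i(\alpha) f(\alpha)}.$$
It is easy to see that $h_0^{q_i(\alpha) f(\alpha)}$ can be computed from the elements $h_0^{f(\alpha)}, h_0^{\alpha f(\alpha)},
\cdots, h_0^{\alpha^{n-2} f(\alpha)}$ in the instance.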
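The closing remark of Phase 1 holds because $q_i(\alpha)$ is a polynomial of degree $n - 2$ with coefficients known to B, so the trapdoor is a product of powers of the instance elements $h_0^{\alpha^k f(\alpha)}$. The sketch below checks this at the exponent level, assuming a toy prime `P`; a concrete `alpha` is used only to produce the instance exponents and the reference value, whereas the simulation function itself touches only the instance exponents and the public coefficients. All names and values are hypothetical.

```python
P = 2**61 - 1  # toy prime standing in for the group order p (assumption)

def poly_mul_linear(coeffs, w, p=P):
    """Multiply a polynomial (coefficients low to high) by (x + w) mod p."""
    out = [0] * (len(coeffs) + 1)
    for k, c in enumerate(coeffs):
        out[k] = (out[k] + c * w) % p
        out[k + 1] = (out[k + 1] + c) % p
    return out

def q_i_coeffs(keyword_space, w_theta, w_i, p=P):
    """Coefficients of q_i(x) = prod (x + w_j) over w_j not in {w_theta, w_i}."""
    coeffs = [1]
    for w in keyword_space:
        if w not in (w_theta, w_i):
            coeffs = poly_mul_linear(coeffs, w, p)
    return coeffs

def simulate_trapdoor_exponent(instance_exps, coeffs, s, p=P):
    """Combine the exponents of h0^{alpha^k f(alpha)} using q_i's coefficients, then scale by s."""
    return s * sum(c * e for c, e in zip(coeffs, instance_exps)) % p

KS = [11, 22, 33, 44, 55]          # hypothetical keyword space, n = 5
w_theta, w_i = 33, 22
alpha, s = 123456789, 424242       # alpha is unknown to B in the real reduction

f_alpha = (alpha + w_theta) % P
# Exponents of the instance elements h0^{alpha^k f(alpha)} for k = 0 .. n-2.
instance_exps = [pow(alpha, k, P) * f_alpha % P for k in range(len(KS) - 1)]

coeffs = q_i_coeffs(KS, w_theta, w_i)
simulated = simulate_trapdoor_exponent(instance_exps, coeffs, s)

# Reference value s * q_i(alpha) * f(alpha), computed directly with alpha.
direct = s * f_alpha % P
for w in KS:
    if w not in (w_theta, w_i):
        direct = direct * (alpha + w) % P
```

In a group implementation the sum over `coeffs` would become a product of the instance elements raised to those coefficients, followed by exponentiation with `s`.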
Challenge. A sends two tuples (m0 , w0 ), (m1 , w1 ) to B for challenge, where the
trapdoor for w0 or w1 has not been queried.
• If $w_\theta \notin \{w_0, w_1\}$, abort.
• If $w_\theta \in \{w_0, w_1\}$, B checks whether $(0, Z)$ and $(1, Z)$ are in the list $L$. If so, B
obtains the corresponding hash values and denotes them as $b_0^*$ and $b_1^*$. Otherwise,
B chooses $b_0^*, b_1^* \in_R \{0, 1\}^{\ell}$ and sets $H(0, Z) = b_0^*$ and $H(1, Z) = b_1^*$.
Then B adds $(0, Z, b_0^*)$ and $(1, Z, b_1^*)$ to the list $L$. B responds to A with the challenge
ciphertext $CT^* = (c_1 = g_0^r, c_2 = b_0^*, c_3 = b_1^* \oplus m_\theta)$.
Theorem 3. The proposed scheme captures the accountability under the Accountabil-
ity game if the (f, n)-DHE Problem is hard.
PROOF. Suppose there exists an adversary A who can break the security of accountability.
We construct an algorithm B that solves the (f, n)-DHE Problem. Given a
challenge instance of the (f, n)-DHE Problem, B interacts with the adversary as
follows.
Setup. B sets $\alpha = a$, so that $h_1 = h^{a}, \cdots, h_n = h^{a^n}$, which are from the (f, n)-DHE
instance. B chooses a hash function $H$ as in the real scheme, and the public
parameters can be denoted as $pp = (PG, H, g, h)$. B sends mpk to A, where
$$mpk = (g^a, h_1, h_2, \cdots, h_n).$$
Challenge. The adversary chooses two keyword sets $W, W'$ with the restriction $|W'| >
1$, $|W| \leq n$, and selects a random number $s \in \mathbb{Z}_p$ as the secret key of the user $sk = s$.
A outputs $(P(W'), W, W', sk)$ and 1 for challenge, where the token is denoted as
$$P(W') = h^{s \prod_{w_j \in W - W'} (a + w_j)}.$$
Win. The adversary A outputs $(P(W'), \Sigma)$ and wins the game if $(P(W'), \Sigma)$ passes
the verification algorithm.
In this case, the proof for accountability should be denoted as
$$\Sigma = \left( \Sigma_1 = h^{\frac{1}{s} \prod_{w_j \in W'} (a + w_j)},\ \Sigma_2 = \Sigma_1^{a^{n-1}} = h^{\frac{1}{s} a^{n-1} \prod_{w_j \in W'} (a + w_j)} \right).$$
Then the token and its proof can pass the verification as
$$e(\Sigma_2, h^{a}) = e(\Sigma_1, h^{a^n}), \quad e\left(h, h^{\prod_{w_i \in W} (a + w_i)}\right) = e(P(W'), \Sigma_1).$$
nomial function with $\deg f(x) > n$. B outputs $(f(x), \Sigma_2)$ as the solution to the
(f, n)-DHE Problem. This completes the proof of Theorem 3. Hence we obtain
$|W'| = 1$, $|P(W')| = 1$.
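The two pairing equations used in the proof of Theorem 3 can be sanity-checked at the exponent level, since bilinearity gives $e(h^x, h^y) = e(h, h)^{xy}$. The sketch below is a minimal illustration under that simplification, assuming a toy prime `P`, $W' \subseteq W$, and hypothetical values for $a$, $s$ and $n$; it does not use a real pairing library.

```python
P = 2**61 - 1  # toy prime standing in for the group order p (assumption)

def prod_linear(a, keywords, p=P):
    """prod_{w in keywords} (a + w) mod p."""
    acc = 1
    for w in keywords:
        acc = acc * (a + w) % p
    return acc

def pairing_exp(x, y, p=P):
    """Exponent of e(h, h) in e(h^x, h^y); bilinearity gives x * y mod p."""
    return x * y % p

a, s, n = 13579, 24680, 6        # hypothetical secret, user key, and size bound
W = [11, 22, 33, 44]             # hypothetical authorized keyword set
W_prime = [22]                   # hypothetical queried subset, W' contained in W

# Exponents of h in the token P(W') and in the proof (Sigma_1, Sigma_2).
token = s * prod_linear(a, [w for w in W if w not in W_prime]) % P
sigma1 = pow(s, -1, P) * prod_linear(a, W_prime) % P
sigma2 = sigma1 * pow(a, n - 1, P) % P

# e(Sigma_2, h^a) = e(Sigma_1, h^{a^n})
check_degree = pairing_exp(sigma2, a) == pairing_exp(sigma1, pow(a, n, P))
# e(h, h^{prod_{w in W}(a + w)}) = e(P(W'), Sigma_1)
check_token = pairing_exp(1, prod_linear(a, W)) == pairing_exp(token, sigma1)
```

The first check ties $\Sigma_2$ to $\Sigma_1$ through the factor $a^{n-1}$, and the second confirms that the token and $\Sigma_1$ together cover exactly the authorized set $W$.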
Acknowledgments.
This work is supported by NSFC (Grant Nos. 61502044), the Fundamental Re-
search Funds for the Central Universities (Grant No. 2015RC23).
8. Conclusion
Motivated by the privacy concern of the data retrieval in the decentralized storage,
we proposed Searchain, a blockchain-based keyword search mechanism that aims to as-
sure private search over authorized keywords with unchanged retrieval order. The core
design of Searchain is oblivious keyword search with authorization (OKSA), which
supports keyword authorization. Searchain adopts OKSA to build the retrieval protocol
so that the node who owns the data can verify the authorization of the keyword in the
retrieval request but learns nothing about it. By further incorporating ordered multisignatures
(OMS) into block generation, Searchain preserves the order of retrieval transactions.
We evaluated Searchain through algorithm implementations, and the results indicate that it is
cost-efficient.
References
[11] D. Boneh, G. D. Crescenzo, R. Ostrovsky, G. Persiano, Public key encryption
with keyword search, in: EUROCRYPT 2004, Vol. 3027 of LNCS, Springer,
2004, pp. 506–522.
[12] C. Fan, V. S. Huang, Provably secure integrated on/off-line electronic cash for
flexible and efficient payment, IEEE Trans. Systems, Man, and Cybernetics, Part
C 40 (5) (2010) 567–579.
[13] W. Ogata, K. Kurosawa, Oblivious keyword search, J. Complexity 20 (2-3) (2004)
356–371.
[14] P. Jiang, X. Wang, J. Lai, F. Guo, R. Chen, Oblivious keyword search with autho-
rization, in: ProvSec 2016, Vol. 10005 of LNCS, Springer, 2016, pp. 173–190.
[15] A. E. Kosba, A. Miller, E. Shi, Z. Wen, C. Papamanthou, Hawk: The blockchain
model of cryptography and privacy-preserving smart contracts, in: S&P 2016,
IEEE Computer Society, 2016, pp. 839–858.
[16] M. Crosby, Nachiappan, P. Pattanayak, S. Verma, V. Kalyanaraman, Blockchain
technology, Tech. rep., Sutardja Center, Berkeley, University of California (2015).
[17] Proof of existence, https://proofofexistence.com/.
[18] Filament, https://filament.com/.
[19] Storj, https://storj.io/.
[26] H. S. Rhee, J. H. Park, W. Susilo, D. H. Lee, Improved searchable public key
encryption with designated tester, in: ASIACCS 2009, IEEE Computer Society,
2009, pp. 376–379.
[27] P. Jiang, Y. Mu, F. Guo, X. Wang, Q. Wen, Online/offline ciphertext retrieval on
resource constrained devices, Comput. J. 59 (7) (2016) 955–969.
[28] J. Camenisch, M. Kohlweiss, A. Rial, C. Sheedy, Blind and anonymous identity-
based encryption and authorised private searches on public key encrypted data,
in: PKC 2009, Vol. 5443 of LNCS, Springer, 2009, pp. 196–214.
[29] M. O. Rabin, How to exchange secrets with oblivious transfer, Tech. Rep.
TR-81, Aiken Computation Laboratory, Harvard University (2005).
[30] C. Chu, W. Tzeng, Efficient k-out-of-n oblivious transfer schemes with adaptive
and non-adaptive queries, in: PKC 2005, Vol. 3386 of LNCS, Springer, 2005, pp.
172–183.
[31] K. Kurosawa, R. Nojima, Simple adaptive oblivious transfer without random or-
acle, in: ASIACRYPT 2009, Vol. 5912 of LNCS, Springer, 2009, pp. 334–346.
[32] J. Camenisch, G. Neven, A. Shelat, Simulatable adaptive oblivious transfer, in:
EUROCRYPT 2007, Vol. 4515 of LNCS, Springer, 2007, pp. 573–590.
[33] M. Green, S. Hohenberger, Universally composable adaptive oblivious transfer,
in: ASIACRYPT 2008, Vol. 5350 of LNCS, Springer, 2008, pp. 179–197.
[34] J. Camenisch, M. Dubovitskaya, G. Neven, Oblivious transfer with access con-
trol, in: CCS 2009, ACM, 2009, pp. 131–140.
[35] W. Aiello, Y. Ishai, O. Reingold, Priced oblivious transfer: How to sell digital
goods, in: EUROCRYPT 2001, Vol. 2045 of LNCS, Springer, 2001, pp. 119–
135.
[36] J. Camenisch, M. Dubovitskaya, G. Neven, Unlinkable priced oblivious transfer
with rechargeable wallets, in: FC 2010, Vol. 6052 of LNCS, Springer, 2010, pp.
66–81.
[37] Y. Chen, J. Chou, X. Hou, A novel k-out-of-n oblivious transfer protocols based
on bilinear pairings, IACR Cryptology ePrint Archive 2010 (2010) 27.
[38] F. Guo, Y. Mu, W. Susilo, Subset membership encryption and its applications to
oblivious transfer, IEEE Trans. Information Forensics and Security 9 (7) (2014)
1098–1107.
[39] H. S. Rhee, J. W. Byun, D. H. Lee, J. Lim, Oblivious conjunctive keyword search,
in: WISA 2005, Vol. 3786 of LNCS, Springer, 2005, pp. 318–327.
[40] M. J. Freedman, Y. Ishai, B. Pinkas, O. Reingold, Keyword search and oblivious
pseudorandom functions, in: TCC 2005, Vol. 3378 of LNCS, Springer, 2005, pp.
303–324.
[41] H. Zhu, F. Bao, Oblivious keyword search protocols in the public database model,
in: ICC 2007, IEEE, 2007, pp. 1336–1341.
[42] A. Boldyreva, C. Gentry, A. O’Neill, D. H. Yum, Ordered multisignatures and
identity-based sequential aggregate signatures, with applications to secure rout-
ing, in: CCS 2007, ACM, 2007, pp. 276–285.
[43] G. Neven, Efficient sequential aggregate signed data, in: EUROCRYPT 2008,
Vol. 4965 of LNCS, Springer, 2008, pp. 52–69.
[44] J. H. Ahn, M. Green, S. Hohenberger, Synchronized aggregate signatures: new
definitions, constructions and applications, in: CCS 2010, ACM, 2010, pp. 473–
484.
[45] N. Yanai, M. Mambo, E. Okamoto, An ordered multisignature scheme under the
CDH assumption without random oracles, in: ISC 2013, Vol. 7807 of LNCS,
Springer, 2013, pp. 367–377.
[46] D. Boneh, X. Boyen, E. Goh, Hierarchical identity based encryption with constant
size ciphertext, in: EUROCRYPT 2005, Vol. 3494 of LNCS, Springer, 2005, pp.
440–456.
[47] Z. Liu, X. Huang, Z. Hu, M. K. Khan, H. Seo, L. Zhou, On emerging family of
elliptic curves to secure internet of things: ECC comes of age, IEEE Transactions
on Dependable and Secure Computing 14 (3) (2017) 237–248.