
Achieving Secure Verifiable and Efficient Boolean Keyword Searchable Encryption For Cloud Data Warehouse

This paper proposes a secure and verifiable searchable encryption scheme for cloud data warehouses that supports Boolean keyword searches over encrypted data. The approach combines Partial Homomorphic Encryption, B+Tree indexing, inverted indexing, and blockchain technology to enhance search efficiency and user authentication without third-party involvement. Comparative experiments demonstrate the scheme's superior performance compared to existing methods, addressing the limitations of current searchable encryption solutions in handling complex queries over encrypted data warehouses.

Received 11 March 2024, accepted 26 March 2024, date of publication 29 March 2024, date of current version 11 April 2024.

Digital Object Identifier 10.1109/ACCESS.2024.3383320

Achieving Secure, Verifiable, and Efficient Boolean Keyword Searchable Encryption for Cloud Data Warehouse
SOMCHART FUGKEAW (Member, IEEE), LYHOUR HAK,
AND THANARUK THEERAMUNKONG
School of ICT, Sirindhorn International Institute of Technology, Thammasat University, Khlong Nueng, Pathum Thani 12000, Thailand
Corresponding author: Somchart Fugkeaw ([email protected])
This work was supported by Office of the Permanent Secretary, Ministry of Higher Education, Science, Research, and Innovation
(OPS MHESI), Thailand Science Research and Innovation (TSRI) and Thammasat University, under Grant RGNS 65-110.

ABSTRACT Cloud data warehouse (CDW) platforms have been offered by many cloud service providers
to provide abundant storage and unlimited accessibility service to business users. Sensitive data warehouse
(DW) data consisting of dimension and fact data is typically encrypted before it is outsourced to the cloud.
However, querying an encrypted DW is not practically supported by existing analytical query tools. The
Searchable Encryption (SE) technique is well suited to supporting keyword searches over encrypted
data. Although many SE schemes have introduced their own unique searching methods based on indexing
structure on top of searchable encryption techniques, there are no schemes that support Boolean expression
queries essential for the search conditions over the DW schema. In this paper, we propose a secure and
verifiable searchable encryption scheme with the support of Boolean expressions for CDW. The technical
construct of the proposed scheme is based on the combination of Partial Homomorphic Encryption (PHE),
B+Tree and Inverted Index, and bitmapping functions to enable privacy-preserving SE with efficient search
performance suitable for encrypted DW. To enhance the scalability without requiring a third party to support
the verification of search results, we employed blockchain and smart contracts to automate authentication,
search index retention, and trapdoor generation. For the evaluation, we conducted comparative experiments
to show that our scheme is more efficient and effective than related works.

INDEX TERMS Cloud data warehouse, searchable encryption, Boolean expressions, homomorphic encryp-
tion, blockchain.

The associate editor coordinating the review of this manuscript and approving it for publication was Nitin Gupta.
© 2024 The Authors. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
VOLUME 12, 2024
S. Fugkeaw et al.: Achieving Secure, Verifiable, and Efficient Boolean Keyword SE for CDW

I. INTRODUCTION
Typically, a data warehouse (DW) serves as the repository for a wide array of sensitive or strategic data, where the aggregated outcomes are derived from a multidimensional framework and feature significantly larger data volumes. The cloud data warehouse (CDW) represents a promising platform that offers high resource resilience and accessibility for businesses. Since the cloud is assumed to be honest but curious, data encryption techniques are generally applied before outsourcing the data to the cloud. The data warehouse is constructed based on a multidimensional model in which multiple dimensions and facts are materialized. One of the common DW models supported by many online analytical processing (OLAP) tools is cube-based or multidimensional OLAP (MOLAP). In MOLAP, a DW consists of a number of data cubes, where each cube represents the pre-computed view of the dimension and fact data.
To support analytical queries over an encrypted DW, the user needs to make a normal query, while the cube result should be returned in an encrypted format. Then, authorized users with a key can decrypt and access the plain query result. However, this is impractical when multiple query results are involved. Searchable encryption (SE) techniques are viable for supporting multiple queries in an efficient manner. SE is a method in which keywords are extracted from a data cube, encrypted,

and uploaded to the cloud. Keywords are shared between data owners and data users over a secure channel. Once a search query is made, the search function is performed by the cloud to match the keyword in the data user's request against those stored on the cloud. Some studies [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27], [28], [29] have proposed solutions to support multiple keyword searches, such as multi-keyword rank searches and range searches with the search structure of a normal index, an inverted index, or a tree index. These works allow users to input more keywords than the traditional ones, which speeds up the searching process. Most of the papers in [3], [4], [5], [6], [10], [11], [12], [13], [15], [16], [17], [22], [24], [26], and [27] introduced their optimized inverted index to support multiple keyword searches, where the index is listed and mapped to each keyword of the encrypted data.
Nevertheless, existing SE schemes are not well suited to supporting efficient search over encrypted DW for several reasons. First, since the cube is constructed based on multiple dimensions and fact data, multiple keyword-based SE is not adequate for the search. A Boolean search that connects multiple keywords from multiple-dimension data and binds them with the indexing is required. Second, existing SE schemes usually rely on a particular search structure for indexing and a collection of documents as the searching object, which are inefficient to apply to encrypted DW. This is because DW has complex data types for each dimension, and any indexing must be adaptable to the various data types within the warehouse. Finally, most SE solutions allow any users to perform searches over the outsourced data as long as they are legitimate users. However, DW is generally used for supporting decision-making, and the search result over certain sets of encrypted cubes should be limited to the specific group of users who have the right to make a query. Therefore, the privacy-preserving SE and indexing structure must be tailored to satisfy this requirement.
Regarding the search strategies, indexing techniques such as B+Tree and bitmap can handle more complex queries such as fuzzy words and Boolean expressions. However, implementing such indexing techniques to support a large number of encrypted cubes together for a secure and verifiable search in a CDW setting is non-trivial. Various scenarios still present potential threats to search permission and the integrity of search results. For example, unauthorized individuals may attempt search queries, or search results could originate from unauthenticated sources or entities lacking proper permission. The privacy-preserving technique applied for indexing is therefore essential.
In this paper, we introduce a secure and verifiable searchable encryption method with the support of Boolean expressions for encrypted data cubes outsourced in the cloud. Our proposed SE scheme is based on Partially Homomorphic Encryption (PHE) to ensure the security of keywords and three key indexing techniques, including B+Tree, inverted index, and bitmapping functions. In addition, we applied blockchain technology to develop and execute smart contracts for enabling search permission and search result verification. The contributions of this article are summarized as follows:
1. We proposed a secure and fine-grained cryptographic-based access control scheme with efficient and verifiable searchable encryption for cloud data warehouse. Our proposed searchable encryption also supports Boolean expressions in the search query over encrypted data cubes outsourced in the cloud.
2. We introduced a novel design of indexing techniques entailing the optimization of search space with the support of range and hierarchical search based on B+Tree indexing with the association of user role structure. In addition, we applied the inverted index and bitmapping to enable fast search for dynamic keyword searches and distinct values of the cube data, respectively.
3. We leveraged blockchain technology and smart contracts to support decentralized and robust user authentication, efficient indexing, and search result verification of OLAP queries, eliminating the need for third-party involvement in the verification process.
4. We conducted a comparative analysis and experiments to demonstrate the efficiency of our proposed scheme.
The remaining sections of this paper are organized as follows: Section II presents related works. Section III describes the background of materialized view, Paillier encryption, and blockchain. Section IV presents our proposed scheme. Section V describes our proposed cryptographic construction. Section VI presents security analysis. Section VII discusses the evaluation and experiments. Section VIII concludes the paper.

II. RELATED WORK
There are several works that propose the technique of searchable encryption over encrypted data with the support of multiple keyword searches in various search structures and functionalities.
Typically, searchable encryption is based on two encryption approaches: symmetric and asymmetric encryption. For symmetric searchable encryption (SSE), a symmetric encryption algorithm such as AES is used to encrypt and decrypt the search keyword. While SSE has been recognized for its efficiency and speed, the cost of key management is high if there are a large number of users. For asymmetric searchable encryption (ASE), the concept of key pairs is applied to the keyword in the way that a public key is used for encryption and a private key is used for decryption. Various forms of asymmetric searchable encryption (ASE) have been examined in the underlying research area. For example, public key encryption with keyword search (PEKS) utilizes the public key to encrypt keywords extracted from data [1], [26]. Attribute-Based Searchable Encryption (ABSE) [6], [13], [14], [15], [18], [29] involves the assignment of attributes to keyword


indices. These attributes are then matched with user query trapdoors to maintain the confidentiality of keywords and the overall encryption characteristics. Additionally, Ciphertext Policy Attribute-Based Searchable Encryption (CP-ABE-SE) is a fine-grained and specialized method that adds an additional layer of security and facilitates complex multi-keyword searches in queries, as used in the scheme [23].
Recent works [5], [22], [27], [36] employed homomorphic encryption to support SE functions. Specifically, both fully homomorphic encryption (FHE) [27] and partially homomorphic encryption (PHE) [5], [22], [36] have been adopted due to their ability to perform operations directly on encrypted data, eliminating the need for decryption. In the case where only basic search operations are needed and efficiency is a primary concern, PHE is a better choice.
In addition to the cryptographic method used as a core construct of SE, indexing search structures can be implemented to support efficient search. For instance, the inverted index is employed in schemes [3], [4], [5], [10], [12], [13], [15], [16], [17], [22], [24], [26], [27], which provide specific locations for the search within a dataset. When a user queries a term, the server promptly references the index, efficiently locating and retrieving the relevant documents. The B+Tree is regarded as a suitable indexing tree for hierarchical and range-based data types. It has been utilized in several schemes [9], [14], [25], [26], [28]. It supports fast queries and dynamic updates, insertion, and deletion, with encrypted indices being stored at leaf nodes as seen in the schemes [30], [31]. Another function to support the fast retrieval of indexing searches is bitmapping. It has been utilized in schemes [32], [33], [34] that are efficient for databases with limited distinct values. By transforming data into bit arrays, bitmap indexing can substantially reduce search costs.
To provide more search capability, there are schemes that can support both multiple keyword and Boolean expressions [4], [21], [27], which deal with more complexity of the index structure and search conditions.
In [4], Zheng et al. introduced a system based on the obfuscating technique and dynamic symmetric searchable encryption that supports a single keyword with Boolean queries. The scheme retrieves bitmaps matching the queried keywords with the chosen anonymous parameter k. The client then computes the Boolean function on these bitmaps to determine the documents' identifiers that satisfy the Boolean query. In [6], the authors developed encrypted indexes for keyword sets associated with the stored data, which allow the cloud service provider (CSP) to perform searches on encrypted data without ever accessing the plaintext keywords, thereby ensuring data confidentiality and privacy. Similarly, in [37], the authors did not mention the use of a standard search index, but they utilized cryptographic methods to ensure keyword searchability in the lightweight public key SE for mobile devices. In [21], the authors proposed the technique of three on-chain indexes: EDindex, BSindex, and PTindex. EDindex manages the storage of encrypted data with an inverted index. BSindex is used to support the calculation of stag and xtoken from the search query before they are compared with the index stored on the blockchain, executed by smart contracts with PTindex. With this on-chain search procedure, smart contracts check whether all xtokens exist in the BSindex using a comparative formula. In [27], the authors presented the utilization of Term Frequency-Inverse Document Frequency (TF-IDF) for the purpose of arranging pertinent outcomes. They also incorporated techniques such as locality-sensitive hashing and Bloom filters to facilitate a fuzzy keyword search, in addition to enhancing the bigram keyword transformation approach. While this approach supports Boolean expressions, the accuracy of search results is lower than that of the systems that directly support Boolean expressions.
Recently, some SE works [8], [21], [35] integrated blockchain technology to offer robust search result verification as well as assist the user authentication process. Employing blockchain also provides transaction traceability and tamper resistance properties beneficial for maintaining trustworthy keyword indices for searchable encryption applications. In [8], Chen et al. proposed a verifiable searchable encryption approach that acquires verification components during trapdoor generation from user queries. This trapdoor is generated with authentication properties and is subsequently validated by the blockchain, serving as proof of the hashed keyword. The utilization of blockchain technology guarantees that search results remain unaltered. In [21], Wang et al. proposed an SE scheme designed to maintain the integrity of medical records. This is achieved through the execution of smart contracts, which also serve the dual role of managing access control for encrypted data by checking who can access and share it.
In [35], Rong-Bing et al. proposed the utilization of blockchain technology for ensuring data integrity. This is done by creating an immutable ledger and managing searchable encryption indexes. This approach not only maintained the confidentiality and privacy of the data, but it also optimized search costs over large volumes of search queries and data sharing transactions.
Nonetheless, employing a single indexing technique directly to support searches across a large number of encrypted data cubes is not feasible. This is due to the high search space costs and the complexity of multidimensional data cubes. As a result, a comprehensive approach that combines Boolean multi-keyword searches, restricted user privilege search spaces, efficient range, and distinct search structures is promising but poses a real challenge.
Our work aims to apply PHE with a combination of B+Tree, inverted index, and bitmapping functions while also integrating blockchain technology. This integration enables secure, efficient, and verifiable searchable encryption for encrypted data cubes.

III. PRELIMINARIES
This section describes the background of the materialized views concept which includes the definition of

multidimensional space and base cube. Then, we briefly describe the Paillier encryption and blockchain technology.

A. MATERIALIZED VIEWS
In a data warehouse, a materialized view (MV) is a pre-computed view result comprising aggregated and/or joined data from fact and possibly dimension tables. In MOLAP, a DW is modelled in a multidimensional space where multiple dimensions are formed and associated with the measure attribute. The precomputed view can be calculated from the possible aggregation operations over the dimensions and measures in a cube.
Definition 1: Multidimensional space: Let $\Omega$ be the space of all dimensions. For each dimension $D_i$, there exists a set of levels, denoted as $levels(D_i)$. A dimension is a lattice $(H, \prec)$ of levels. Each path in the lattice of a dimension hierarchy, beginning with its least upper bound and ending with its greatest lower bound, is called a dimension path. For example, the dimension path [day, week, month, year] is represented as day ≺ week ≺ month ≺ year.
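
As a hedged illustration of the materialized-view idea and the dimension path in Definition 1, the following Python sketch rolls a few made-up fact rows up to different levels of the path day ≺ week ≺ month ≺ year. The sample rows, the `LEVEL_KEY` mapping, and the helper names `precedes` and `roll_up` are assumptions introduced only for this example; they are not part of the proposed scheme, and the sketch works on plaintext values.

```python
from collections import defaultdict
from datetime import date

# Dimension path from Definition 1, ordered from finest to coarsest level.
TIME_PATH = ["day", "week", "month", "year"]

# Hypothetical mapping from a calendar date to its key at each level.
LEVEL_KEY = {
    "day": lambda d: d.isoformat(),
    "week": lambda d: "{}-W{:02d}".format(*d.isocalendar()[:2]),
    "month": lambda d: d.strftime("%Y-%m"),
    "year": lambda d: str(d.year),
}


def precedes(lower, upper, path=TIME_PATH):
    """Return True if `lower` strictly precedes `upper` on the dimension path."""
    return path.index(lower) < path.index(upper)


def roll_up(facts, level, agg=sum):
    """Aggregate (date, measure) facts at `level`, giving one view of the cube."""
    groups = defaultdict(list)
    for day, measure in facts:
        groups[LEVEL_KEY[level](day)].append(measure)
    return {key: agg(values) for key, values in sorted(groups.items())}


# Made-up amounts standing in for the measure of a base cube.
facts = [
    (date(2024, 1, 15), 100),
    (date(2024, 1, 20), 50),
    (date(2024, 2, 3), 70),
    (date(2023, 12, 31), 30),
]

monthly_view = roll_up(facts, "month")        # sum of the measure per month
yearly_max = roll_up(facts, "year", agg=max)  # max of the measure per year
```

Each call to `roll_up` with a different level or aggregate yields a different pre-computed view over the same facts, which is the sense in which a materialized view is derived from aggregation operations over the dimensions.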
Definition 2: Base Cube
A base cube $C_b$ is a 3-tuple $\langle D, L, R \rangle$ where
• $D = \langle D_1, D_2, \ldots, D_n, M \rangle$ is a list of dimensions ($D_i, M \in \Omega$). $M$ is a measure of the cube.
• $L = \langle DL_1, DL_2, \ldots, DL_n, *ML \rangle$ is a list of dimension levels ($DL_i, *ML \in \Psi$). $ML$ is the dimension level of the measure of the cube, where the measure level ($*ML$) belongs to a set $\Psi$. This set represents all possible measure levels within the data warehouse schema.
• $R$ is a set of cell data formed as a tuple $x = (x_1, x_2, \ldots, x_n, *m)$ where, for $i \in [1, \ldots, n]$, $x_i \in dom(DL_i)$ and $*m \in dom(*ML)$.
In our model, we assume that the materialized view represents all possible views of the base cube $C_b$. Each view is computed from the set of aggregation operations including {sum, avg, count, max, min, rank(n)}. Each one of the operations results in a new cube $c'$ or a materialized view (MV).

B. PAILLIER ENCRYPTION [36]
Paillier Encryption (PE) is a probabilistic asymmetric algorithm for public key cryptography. In PE, the message space $M$ for the encryption is $\mathbb{Z}_n$, where $n$ is a product of two large prime numbers $p$ and $q$.
Let $L$ be defined as $L(x) = (x - 1)/n$. For a message $m \in \mathbb{Z}_n$, we denote $[m] \in \mathbb{Z}_{n^2}$ to be the encryption of $m$ with the public key $pk$. Particularly, Paillier encryption consists of three algorithms $P = \{P.KeyGen, P.Enc, P.Dec\}$ which are defined as follows:
• $P.KeyGen(1^k)$: This algorithm is used to generate the key pair. It begins by establishing an RSA modulus $n = pq$ of $k$ bits where $p$ and $q$ are large primes such that $\gcd(pq, (p-1)(q-1)) = 1$. Let $K = \mathrm{lcm}(p-1, q-1)$ and pick $g \in \mathbb{Z}^*_{n^2}$. The public key is the pair $pk_p = (n, g)$ and the secret key is $sk_p = K$.
• $P.Enc_{pk_p}(m)$: This algorithm is employed to encrypt a message $m \in \mathbb{Z}_n$: choose $r \in \mathbb{Z}^*_n$ and compute $[m] = g^m \cdot r^n \bmod n^2 \in \mathbb{Z}_{n^2}$.
• $P.Dec_{sk_p}([m])$: To decrypt a ciphertext $c = [m]$, this algorithm computes $m$ as follows: $m = \frac{L(c^{sk_p} \bmod n^2)}{L(g^{sk_p} \bmod n^2)} \bmod n$.

C. BLOCKCHAIN
Blockchain technology is an immutable, distributed, transparent, and traceable ledger that records the provenance of digital data. Its foundation lies in public key encryption and cryptographic hashing techniques. The digital assets or data stored within each block maintain their immutability due to the fact that once a block is finalized, it is hashed and interconnected with others in the blockchain network. In a typical blockchain structure, each block comprises essential elements, including a cryptographic hash of the preceding block, a timestamp indicating when the transaction took place, a nonce value, and the transaction data. Smart contracts, which are self-executing programs, can be deployed and operated on a blockchain network.

IV. OUR PROPOSED SCHEME
In this section, we present the system model, our proposed indexing technique, and the construction of the searchable encryption scheme.

A. SYSTEM MODEL
We propose a secure and verifiable searchable encryption scheme for cloud data warehouses. Figure 1 illustrates the system overview of our proposed scheme.
FIGURE 1. Our system model.
The system model consists of the following entities.
1. The Private Cloud Service Provider is responsible for storing the data cube, which is organized using


MOLAP methodology following the ETL process, where data is extracted from various sources, transformed, and loaded. The data owners extract keywords from each data cube (MV) before subjecting them to encryption via a Paillier cryptographic algorithm. Subsequently, all the encrypted data cubes (Enc_MV) are transmitted to the proxy server hosted in the public cloud.
2. Proxy Server is a semi-trusted server located in the cloud responsible for executing searches and returning search result indices to the blockchain. Additionally, it maintains a memory cache for frequently queried data within a specific timestamp to expedite search retrieval.
3. The Public Cloud Service Provider (Pub_CSP) is responsible for housing all the components related to Enc_MV, which is organized in a B+Tree structure to facilitate rapid searches. Enc_kw, the encrypted keywords, serves a triple-purpose function: 1) It extends the leaf nodes of the B+Tree as the parent tree to enable range and hierarchical searches. 2) It functions as a database or table for creating an inverted index for specific keywords. 3) It is used as a large table for bitmap indexing of distinct keyword values.
4. Blockchain platform serves as the repository for accessing and searching transaction records. It incorporates smart contracts that fulfill various roles, including storing evidence of keywords, validating user permissions, authorizing search queries to locate the index of Enc_MV related to the keyword and user's trapdoor, and conducting integrity checks.
5. Data Users (DUs) perform an OLAP query or search the keywords to get a particular Enc_MV.

B. OUR PROPOSED B+TREE, INVERTED INDEX, AND BITMAP INDEXING FOR ENCRYPTED CUBES
Our proposed SE method combines three indexing and search structures: B+Tree, inverted index, and bitmap index. Each of these structures is designed to handle distinct types of data values associated with individual dimensions and factual data within the cube. To better grasp the concept of the data cube, Table 1 provides an example from a bank loan scenario, demonstrating the construction of multidimensional data.
TABLE 1. Example of a bank loan data cube.
In the context of the multidimensional data cube, as illustrated in Table 1 above, we construct all data cubes using the B+Tree data structure. In our design, there are 38,000 generated records for all data cubes, and this B+Tree structure greatly facilitates rapid retrieval, insertion, and deletion of data. In our design, the structure is associated with user privileges, where users can only query the cube that aligns with their role within the system. However, within each data cube, there can be thousands of records. The implementation of B+Tree search significantly narrows down the search space, leading to reduced time consumption when searching for specific records within a data cube. A sample B+Tree search structure is depicted in Figure 2 below.
FIGURE 2. A sample of B+Tree structure.
The B+Tree depicted above has a maximum degree of 3, and each leaf node corresponds to a unique number of values in ascending order, connected by linked pointers. Each leaf node possesses a distinct node key number, which is assigned in ascending order from the smallest to the largest. A parent node may share the same unique node key value with one of its leaf nodes, yet this value essentially serves as an index number delineating the range of its child nodes. For example, a parent node with a node key value of 0006 may have child nodes with key values of 0005 and 0006. It is important to note that each child node maintains a unique node key value, ensuring a clear and orderly structure within the system. For instance, when a user queries for an amount "x" where x is less than 8 and greater than 2, the result would be returned from all leaf nodes whose node key values range from 3 to 7. Additionally, we integrate three indexing search functions for each data cube to efficiently retrieve data. These functions include the B+Tree, which facilitates range or hierarchical searches, similar to the parent B+Tree used for searching within a specific cube. The inverted index is employed for keyword-oriented attributes such as name or campus, and the bitmapping function supports searches for distinct values. Figure 3 illustrates the sub-B+Tree, which is one of the three combined search functions, serving as a subset of each leaf node of the main B+Tree.
In the initial setup, the parent B+Tree stores an encrypted data cube at each leaf node, and our proposed three indexing search functions are integrated for each cube. Consequently, when a user submits a query to retrieve records from any data cube, the query is divided into various search functions


that are embedded at each leaf node of the parent B+Tree structure.
FIGURE 3. B+Tree and Sub-B+Tree.
Additionally, we have another search function in the form of the inverted index, which is illustrated in Figure 4 below.
FIGURE 4. Example of inverted index.
The inverted index proves valuable for attributes with a focus on keywords. From Figure 4, before we constructed the indexing format, we arranged the set of keywords (Set of KW) associated with 1 ID (Cube ID) per record in a row of an inverted index table. The Keywords (KwN) of each record can also be duplicated for a number of records themselves. Then, we formatted the index of each specific keyword (Keyword) associated with a list (set of Cube IDs) where a particular keyword is found in all IDs. For instance, if we have five records for customer names represented as {ID, LastName, FirstName} with values {[1, 'Mary', 'Johnson'], [2, 'Jennifer', 'Mary'], [3, 'Linda', 'Jennifer'], [4, 'Taylor', 'Mary'], [5, 'Linda', 'Johnson']}, we structure them as follows:
'Mary': {"Mary": [{1}, {2}, {4}]},
'Johnson': {"Johnson": [{1}, {5}]},
'Jennifer': {"Jennifer": [{2}, {3}]},
'Linda': {"Linda": [{3}, {5}]},
'Taylor': {"Taylor": [{4}]}
The inverted index structure enables the grouping of multiple IDs into an index, with a dictionary storing those IDs that share the same string value, regardless of whether it pertains to LastName or FirstName. When a user queries for 'Mary' and 'Johnson', we point to the dictionary index of {"Mary": [{1}, {2}, {4}]} and {"Johnson": [{1}, {5}]}, and the result is {"Mary AND Johnson": [{1}]} representing the intersection based on the 'AND' operation.
To accommodate limited distinct values with Boolean operations, we introduce a bitmapping function that also supports Boolean expression searches. Figure 5 provides an example of how the bitmapping function operates.
FIGURE 5. Bitmapping function.
The binary bitmapping function allows for highly efficient searches of any distinct value. As illustrated in Figure 5, the result from a user's query can be quickly identified by mapping the bit result to the structured documents. For example, if the input is Loan_Type B or C, the bitmap value of each loan will undergo an OR operation, producing a binary outcome. This outcome will then be assigned to the index location of the document according to its ID.
From the above three index searching structures, our proposed system can facilitate the search queries quickly and effectively because we handle the data types of each record efficiently, regardless of the query complexity. The user query will be broken down into 3 phases/functions, starting with B+Tree to handle the range and hierarchical data, inverted index for the value of attributes, and bitmapping for distinct values. The system returns the intersection of the output from those search functions as the final output.

C. SECURITY MODEL
In this section, we present the security model for our proposed scheme. The security model defines the nature of the adversary, their capabilities, and the interactions between the data owner, authorized users, and the adversary within the proposed scheme. This security model is established according to the following adversarial model.
• Adversary Set: $A \subseteq A_{all}$ ($A$ is a subset of all possible adversaries $A_{all}$).


• Adversary Type: A is a computationally bounded, passive adversary.
• Computational Bound: The computational capabilities of adversary A are bounded in a manner preventing them from solving problems that necessitate both polynomial space and computational resources.
• Active Attacks: A ∩ Active Attacks = ∅ (A is limited to passive attacks and cannot engage in active attacks).

TABLE 2. Notation.

1) SEARCH QUERY MODEL


Adversary’s Capabilities: Adversary A can submit search
queries to the encrypted data and receive corresponding
search results without learning the underlying data. A can also
submit data to the encrypted index.
System Components:
1. Data Owner (DO)
• The data owner encrypts and stores the data using the
Paillier encryption scheme.
• The data owner builds an index for efficient search and
provides authorized users with search capabilities.
For a given keyword and index I:
• DO →(Encrypt) keywordcipher = Paillier(keyword)
• DO →(Index) I(keyword) V. OUR CRYPTOGRAPHIC CONSTRUCTION
2. Authorized Data Users (DUs) have the capability to The section presents the details and analyses of the DW-
perform searches on the encrypted data and retrieve MBSE construction. To ease of explanation, we define the
relevant results without revealing the plaintext data. notations used in our model as shown in Table 2 below.
These users have a secret key for decryption. Our scheme consists of ten major phases: system setup,
• DU →(Search) Results (keywordcipher , q) keyword extraction, keyword encryption, data and keyword
• DU →(decrypt) keywordplain = Paillier −1 (keywordcipher ) structure, user query process, trapdoor generation, search
Security Properties mechanism, blockchain result verification, user decryption,
Confidentiality: and data caching.
• The searchable encryption scheme guarantees the confi-
dentiality of the data. A. PHASE1: SYSTEM SETUP
• A passive adversary should not be able to learn any In this phase, various components are set up, including the
information about the plaintext data from the encrypted generation of public and private keys, a unique user ID for
data, index, or search queries. data user identification, a proof of keyword to be stored
• Formalized: A plaintext on the blockchain, and the configuration of cache memory
Search Privacy: on the proxy server located in the public cloud. While all
cryptographic keys are generated by the Trusted Authority
• Search queries should not reveal any information about
(TA), the remaining tasks are executed by the private cloud,
the search terms or the data being searched.
with the exception of caching, which is managed by the public
• An adversary should not be able to determine which
cloud. The system setup details are provided in Algorithm 1
terms are being searched.
as the following pseudo code:
• Formalized: A Info(queries)
Once the Algorithm 1 is executed, the following system
Index Privacy: components are created:
• The searchable index should not leak information about
• Public Key and Private Key for Paillier Cryptography:
the data or the search terms, even when search queries The public and private keys required for Paillier cryp-
are made. tography are generated and ready for use in the system.
• Formalized: A Info(index)
• Empty Dictionary for Proof of Keywords: An empty
Keyword Privacy: dictionary is set up to store proof of keywords. This
• The scheme ensures the privacy of keywords used in dictionary will be used to securely store keywords on
search queries. the blockchain.
• Even if an adversary observes multiple search queries • Unique ID for Each Data User: A unique identification
with overlapping keywords, they should not be able to (ID) is created for each data user. This ID will help iden-
deduce sensitive information about the data. tify and distinguish individual users within the system.
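As a rough illustration of the Paillier key material listed above, the following Python sketch implements textbook Paillier key generation, encryption, and decryption, and checks the additive homomorphism. It is not the authors' implementation: the fixed Mersenne primes, the choice g = n + 1 (Algorithm 1 samples g at random), and the helper names generate_keypair, encrypt, decrypt, and keyword_to_number are assumptions made for illustration, and the parameters are far too small for real use.

```python
import math
import secrets


def generate_keypair(p: int, q: int):
    """Toy Paillier key generation from two distinct primes p and q."""
    n = p * q
    # Same side condition as in Algorithm 1.
    assert math.gcd(n, (p - 1) * (q - 1)) == 1
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1  # assumption: a common simplification instead of a random g
    # mu = (L(g^lam mod n^2))^-1 mod n, with L(x) = (x - 1) // n
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu)


def encrypt(public_key, m: int) -> int:
    """Encrypt an integer 0 <= m < n under the public key (n, g)."""
    n, g = public_key
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(public_key, private_key, c: int) -> int:
    """Recover m = L(c^lam mod n^2) * mu mod n."""
    n, _ = public_key
    lam, mu = private_key
    return ((pow(c, lam, n * n) - 1) // n * mu) % n


def keyword_to_number(keyword: str) -> int:
    """Hypothetical keyword encoding: UTF-8 bytes read as a big-endian integer."""
    return int.from_bytes(keyword.encode("utf-8"), "big")


if __name__ == "__main__":
    # 2^31 - 1 and 2^61 - 1 are Mersenne primes; illustration only.
    pub, priv = generate_keypair(2**31 - 1, 2**61 - 1)
    c1, c2 = encrypt(pub, 20), encrypt(pub, 22)
    # Multiplying ciphertexts adds the underlying plaintexts.
    print(decrypt(pub, priv, (c1 * c2) % (pub[0] ** 2)))
```

Because encryption draws a fresh random r each time, encrypting the same keyword twice gives different ciphertexts that both decrypt to the same value.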

Algorithm 1 also prepares an empty dictionary for storing the index of search results. This dictionary is used in the memory cache on the proxy server to enhance search efficiency. Together, these components are fundamental to the system's operation, enabling secure keyword storage, user identification, and efficient search result retrieval.

Algorithm 1 System Setup
    systemSetup() → (public_key, private_key, userDatabase, proofOfKeyword, cache) {
        # Paillier_setup(): choose two large prime numbers at random
        p, q ← random large primes such that gcd(pq, (p-1)(q-1)) = 1
        n ← p × q
        λ ← lcm(p-1, q-1)
        g ← random integer in Z*_{n^2}
        µ ← (L(g^λ mod n^2))^{-1} mod n, where L(x) = (x - 1)/n
        return public key (n, g), private key (λ, µ)
        # Remaining setup steps
        public_key, private_key ← Paillier_setup()
        userDatabase ← {}
        for each user do
            userDatabase[user.ID] ← {"role": user.role, "public_key": user.public_key}
        end for
        proofOfKeyword ← {}
        cache ← {}
    } end

B. PHASE2: KEYWORD EXTRACTION
In this stage, which runs in the private cloud, keywords are extracted from each data cube. The keywords are divided based on their value type, representing each dimension of the multi-dimensional data cube stored in the data warehouse. The process is detailed in the following pseudo code:

Algorithm 2 Extract Keywords
    Extract_keywords(records) → K {
        K ← {}
        K.append(records(date[day, month, year]))
        K.append(records(customer[name, branch, loan_type]))
        K.append(records(amount[day, month, year]))
        return K
    } end

C. PHASE3: KEYWORD ENCRYPTION AND FORWARDING
In this phase, the data owner applies Paillier encryption to the extracted keywords. The set of keywords, along with their associated Enc_kw and Enc_MV, is then distributed to various components: the proof of keyword is forwarded to the blockchain, and the encrypted keyword (Enc_kw) and encrypted data cube (Enc_MV) are sent to the proxy server in the public cloud. The detailed algorithm is presented in Algorithm 3 as follows:

Algorithm 3 Encrypt and Keywords Forwarding
    encrypt_and_send_keywords(K, public_key) → (Encrypted_keywords, Proofs) {
        E(K) ← {}
        for each keyword in K do
            encryptedKeyword ← Paillier_Encrypt(keyword, public_key)
            E(K)[keyword] ← encryptedKeyword
            proofOfKeyword[keyword] ← Hash-SHA256(keyword)
        end for
        send_to_cloud(E(K))
        send_to_Blockchain(proofOfKeyword)
    } end

The inclusion of a "proof of keyword" on the blockchain serves the essential purpose of integrity verification during the process of returning search results from the proxy to the blockchain. It ensures that the search results have not been tampered with or altered in any way, allowing for the validation of data integrity as it moves between different components of the system.

D. PHASE4: KEYWORD CONSTRUCT
In this stage, the proxy server constructs the Enc_MV based on B+Tree, where each leaf node of the B+Tree is extended to support three additional searching functions for Enc_kw. The construction process is described in Algorithm 4 as follows:

Algorithm 4 Structure Encrypted Keyword
    structure_keyword(E(K)) → (Inverted_index, Bitmap_index, B+Tree) {
        Inverted_index ← create_Inverted_index(E(K))
        Bitmap_index ← create_Bitmap_index(E(K))
        B+Tree ← create_B+Tree(E(K))
    } end

This algorithm outlines the process of constructing a B+Tree structure for Enc_MV and extending each leaf node to support three different searching functions. These functions are designed to facilitate various search operations on the encrypted data, enhancing search efficiency and accuracy. The construction of Enc_MV as a B+Tree with its node-key-value, along with the creation of three additional structures (inverted index, bitmapping, and sub B+Tree) as extensions from the leaf nodes of the parent B+Tree, is a comprehensive approach to organizing and indexing multi-dimensional data securely and efficiently. These structures enhance the ability to search for and retrieve data from the encrypted data cube while maintaining data privacy and security.

E. PHASE5: USER QUERY PROCESS
After the system is fully set up, data users can submit search queries to the blockchain.
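To make the index structures of Phase 4 and the Boolean queries submitted in this phase more concrete, the sketch below builds a plaintext inverted index over the five customer records from the earlier inverted-index example and evaluates AND and OR as set intersection and union, with an integer bitmap as an alternative representation. It is only a plaintext analogue under stated assumptions: in the scheme the index is built over encrypted keywords, and the function names (build_inverted_index, search_and, search_or, to_bitmap) are hypothetical.

```python
from collections import defaultdict

# The five customer records from the inverted-index example: (ID, LastName, FirstName).
RECORDS = [
    (1, "Mary", "Johnson"),
    (2, "Jennifer", "Mary"),
    (3, "Linda", "Jennifer"),
    (4, "Taylor", "Mary"),
    (5, "Linda", "Johnson"),
]


def build_inverted_index(records):
    """Map each string value to the set of record IDs that contain it."""
    index = defaultdict(set)
    for record_id, *values in records:
        for value in values:
            index[value].add(record_id)
    return dict(index)


def search_and(index, *keywords):
    """IDs containing every keyword (Boolean AND as set intersection)."""
    postings = [index.get(k, set()) for k in keywords]
    return set.intersection(*postings) if postings else set()


def search_or(index, *keywords):
    """IDs containing at least one keyword (Boolean OR as set union)."""
    return set().union(*(index.get(k, set()) for k in keywords))


def to_bitmap(ids):
    """Encode a set of record IDs as an integer bitmap (bit i set for ID i)."""
    bitmap = 0
    for record_id in ids:
        bitmap |= 1 << record_id
    return bitmap


if __name__ == "__main__":
    index = build_inverted_index(RECORDS)
    print(sorted(search_and(index, "Mary", "Johnson")))
    print(sorted(search_or(index, "Mary", "Johnson")))
    # The same AND expressed on bitmaps is a single bitwise operation.
    print(bin(to_bitmap(index["Mary"]) & to_bitmap(index["Johnson"])))
```

The posting sets reproduce the dictionary shown in the example ('Mary' maps to IDs 1, 2, and 4; 'Johnson' to IDs 1 and 5), so the conjunctive query for both keywords narrows to the single record they share.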

The blockchain either grants or denies permission for the search based on several criteria, including the validity of the user's ID, the presence of a non-empty query, and whether any compromises are detected during the result verification process. The specifics of this access control mechanism are outlined in Algorithm 5, as follows:

Algorithm 5 Process User Query
    process_User_Query(userID, query) → (encrypted_search_results or error_message) {
        if NOT user_Identity_Check(userID, userDatabase) then
            return "Unauthorized User"
        end if
        if query IS_EMPTY then
            return "Empty Query"
        end if
        trapdoor ← generate_Trapdoor(query, public_key)
        result ← search_and_verify(trapdoor)
        if result IS_NOT_Verified then
            return "Verification Failed"
        end if
        return result
    } end

In Algorithm 5, the process begins with the data user inputting their userID, which is then verified by the blockchain to ensure its validity. After successful userID verification, the user can proceed to enter their search query. If the search query is empty, the request is rejected with an "Empty Query" message. Once the search query is authenticated and authorized, the next step involves generating a trapdoor. The details of trapdoor generation are presented in Algorithm 6.

F. PHASE6: TRAPDOOR GENERATION
This phase involves the generation of a trapdoor by the blockchain, utilizing the user's query and applying cryptographic mechanisms. This trapdoor is then forwarded to the proxy server to carry out the search. The process is outlined in the following pseudo code:

Algorithm 6 Generate Trapdoor
    generate_Trapdoor(query, public_key) → trapdoor {
        trapdoor ← Paillier_Encrypt(Convert_To_Number(query), public_key)
        return trapdoor
    } end

G. PHASE7: SEARCHING IN CLOUD
In this phase, the proxy server carries out the search operation based on the trapdoor received from the blockchain. The search process involves several steps to enhance efficiency and accuracy. These steps are as follows:
1) Node Key-Value Search of Parent B+Tree: Initially, the proxy performs a search on the node key-values within the parent B+Tree. This step narrows down the search space, improving the efficiency of the search operation.
2) Search via Inverted Index: For each expression within the query, the proxy utilizes the inverted index to search for relevant data. This is one of the search functions supported by the B+Tree structure, allowing for precise keyword-oriented searches.
3) Search via Bitmapping: The proxy conducts a search through the bitmapping function. This method supports distinct value searches and Boolean expression-based searches, providing flexibility in querying.
4) Search via Sub-B+Tree: The proxy also employs the sub-B+Tree structure as one of the search functions, utilizing it to locate specific data within the leaf nodes of the parent B+Tree.
The pseudocode below illustrates the algorithm of our searching strategy over encrypted data.

Algorithm 7 Searching on Cloud
    search_Cloud(trapdoor, Inverted_index, Bitmap_index, B+Tree) → combinedResults {
        results ← {}
        results["Inverted"] ← search_Inverted_index(trapdoor, Inverted_index)
        results["Bitmap"] ← search_Bitmap_index(trapdoor, Bitmap_index)
        results["B+Tree"] ← search_B+Tree(trapdoor, B+Tree)
        combinedResults ← Combine_Results(results)
        return combinedResults
    } end

Once the index of the search result is obtained, it is sent to the blockchain for checking before being forwarded to the data user.

H. PHASE8: BLOCKCHAIN RESULT VERIFICATION
This phase involves the blockchain's execution to check the integrity of the search result, relying on the proof of keyword that has been previously stored on the blockchain. This verification ensures that the data user receives the search result from a trusted and untampered source. The detailed algorithm for this verification process is given in Algorithm 8.
Algorithm 8 outlines the process of verifying the integrity of the search result by comparing the provided proof of the keyword with the one stored on the blockchain. If the two proofs of keyword match, the search result is considered trusted; otherwise, it is not trusted. This verification step ensures that the data user receives reliable and untampered search results.

I. PHASE9: DECRYPTION
The decryption phase is performed by the data user who initiated the search query.
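The hash comparison behind the Phase 8 check (Algorithm 8) can be sketched in Python as follows. The sketch assumes that each search result carries its keyword in a "keyword" field and that the proof of keyword is a SHA-256 hex digest, as in Algorithm 3; the blockchain storage is replaced by an in-memory dictionary, and all function names are illustrative rather than taken from the authors' code.

```python
import hashlib


def keyword_proof(keyword: str) -> str:
    """SHA-256 hex digest used as the proof of a keyword."""
    return hashlib.sha256(keyword.encode("utf-8")).hexdigest()


def build_proofs(keywords):
    """Stand-in for the proofOfKeyword dictionary stored on the blockchain."""
    return {kw: keyword_proof(kw) for kw in keywords}


def verify_results(results, proof_of_keyword) -> bool:
    """Return True only if every result's keyword matches its stored proof.

    `results` is assumed to be an iterable of dicts with a "keyword" entry.
    """
    for result in results:
        kw = result["keyword"]
        if proof_of_keyword.get(kw) != keyword_proof(kw):
            return False
    return True


if __name__ == "__main__":
    proofs = build_proofs(["Mary", "Johnson"])
    print(verify_results([{"keyword": "Mary", "ids": [1, 2, 4]}], proofs))
    # A keyword with no stored proof fails verification.
    print(verify_results([{"keyword": "Linda", "ids": [3, 5]}], proofs))
```

A result whose keyword has no stored proof, or whose stored proof has been altered, fails the check, which corresponds to the "Verification Failed" branch of Algorithm 5.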

Algorithm 8 Blockchain Result Verification
    blockchain_Verify(results, proofOfKeyword) → True/False {
        for each result in results do
            if NOT proofOfKeyword[result.keyword] = Hash(result.keyword) then
                return False
            end if
        end for
        return True
    } end

This phase involves decrypting the index of Enc_kw, aiming to achieve backward security and prevent any patterns that might allow the cloud to understand the keyword from the search query. The decryption algorithm is described as follows:

Algorithm 9 User Data Retrieval and Decryption
    retrieve_And_Decrypt_Data(Encrypted_index, private_key) → Encrypted_Data_Cube {
        Decrypted_index ← Paillier_Decrypt(Encrypted_index, private_key)
        Encrypted_Data_Cube ← Cloud_index(Decrypted_index)
        return Encrypted_Data_Cube
    } end

Once the data user has successfully decrypted the index of Enc_kw, they can use this decrypted index to directly locate and retrieve the targeted Enc_MV (encrypted data cube). This direct access allows the user to obtain the specific data they were searching for while preserving data privacy and security.

J. PHASE10: USER DATA RETRIEVAL (FOR CACHING)
This phase involves the collaboration between the proxy and the blockchain to monitor the frequency of search queries from a particular user. If the same query is requested more than three times, the system will store the index of the search result on the proxy. This caching mechanism aims to avoid repeating the search process, thereby reducing costs, particularly in terms of time and resource consumption. The details of this caching process are described in Algorithm 10 as follows:

Algorithm 10 User Data Retrieval
    retrieve_Data(userID, trapdoor, cache) → (Retrieved_data or cache) {
        unique_key ← Concatenate(userID, trapdoor)
        if unique_key IN cache then
            cached_result ← cache[unique_key]
            if Time_Since(cached_result.timestamp) < 60 minutes then
                return cached_result.data
            end if
        end if
        result ← search_And_Verify(trapdoor)
        cache[unique_key] ← {"timestamp": Current_Time(), "data": result}
        return result
    } end

Notably, while the system employs caching for improved efficiency, each cached entry is set to expire and be removed after 60 minutes. This time limit ensures that the cached results do not remain accessible indefinitely, and users are always working with the most current and secure data.

VI. SECURITY ANALYSIS
We analyze the security of our proposed scheme based on the security of Paillier encryption, query verification, secure user authentication, and backward security.

A. PARTIAL HOMOMORPHIC ENCRYPTION
Let CT be the ciphertext space and PT be the plaintext space. Let λ be the security parameter.
Definition 3: The Decisional Composite Residuality Assumption (DCRA) states that it is computationally infeasible to distinguish between a random composite residue x and a random composite non-residue y mod n^2, where n is a composite number.
CPA-Security: An encryption scheme is CPA-secure if an adversary that is allowed to request polynomially many encryptions of messages of its choice cannot distinguish the encryption of one chosen message from that of another.
Theorem 1: The Paillier cryptographic scheme is CPA-secure, given that the decisional composite residuality assumption holds.
Proof.
Game 0 (G0)
In this game, an adversary A chooses two distinct messages dm_0, dm_1 ∈ PT and sends them to the challenger C. The challenger then selects a random bit rb ∈ {0, 1} and forwards the ciphertext CT = Enc_pk(dm_rb) back to A. The adversary A outputs a guess rb′ and wins if it correctly guesses rb.
$\mathrm{Win}_{G_0} = \left| \Pr[rb' = rb] - 0.5 \right|$   (1)
Game 1 (G1)
This game is identical to G0 except that the challenger selects a random element re′ from CT and forwards it to A instead of the real ciphertext re.
$\mathrm{Win}_{G_1} = \left| \Pr[rb' = rb] - 0.5 \right|$   (2)
Reduction to DCRA
Assume, for contradiction, that there exists a polynomial-time adversary A that can distinguish G0 from G1 with non-negligible advantage e. Then, we construct a polynomial-time algorithm B that solves the DCRA problem with advantage at least e.
For this, B simulates the challenger C for A and uses A's guess to solve the DCRA problem. If A makes a guess of
b correctly, B concludes that the given instance was a composite residue; otherwise, it concludes that it was a composite non-residue.
Thus, we get: $|\mathrm{Win}_{G_0} - \mathrm{Win}_{G_1}| \le \mathrm{Adv}_{DCRA}(\lambda)$
If the DCRA problem is hard, then $\mathrm{Adv}_{DCRA}(\lambda)$ is negligible, making the Paillier encryption scheme CPA-secure:
$\mathrm{Adv}_{CPA}(Enc_{pk}, Dec_{sk}, \lambda) \le \mathrm{Adv}_{DCRA}(\lambda)$
This sums up the proof that the Paillier encryption scheme is CPA-secure under the assumption that the Decisional Composite Residuality Assumption (DCRA) holds. The proof relies on the indistinguishability of Games 0 and 1 and on the reduction to the DCRA problem to establish the security of the scheme. This completes the formal proof for Paillier encryption.
Moreover, according to [36], x and y are random numbers and rn = pq, and the ciphertext ct of pt is obtained by $ct = y^{pt} \cdot x^{rn} \bmod rn^{2}$, where p and q are two 64-bit large prime numbers for the Paillier algorithm. For an intercepted ciphertext ct, it is not possible to recover the corresponding plaintext pt, because doing so is a problem of computing n-th residue classes. The private key is generated from p and q and is hard to crack because of these large prime factors.
In addition to the generic Paillier cryptographic mechanism and the discussion of random primes above, Algorithm 1 shows how Paillier is used in our scheme, where the parameters depend on strong random primes p and q. p and q are chosen at random subject to gcd(pq, (p-1)(q-1)) = 1, and we generate the public key (n, g) and the private key (λ, µ). The components of each key are generated differently, and the private key is mathematically more complex. To decrypt a Paillier ciphertext without the private key, an attacker would have to compute n-th residue classes, which is what underpins the security of the generated private key.

B. QUERY VERIFICATION
• Verifiable Search Request
Our proposed system checks the query in the user's request based on Algorithm 5 in the system construction. Within the framework of index search via the parent B+Tree, only authorized users possess knowledge of the B+Tree's index, where each unique node key value corresponds to a distinct data cube. Specifically, individuals serving in a specific role are assigned the unique node key value associated with the leaf node beneath the parent B+Tree. This design ensures that the confidentiality of other data cubes, as well as of diverse roles or positions, remains secure against unauthorized access when users execute queries. It is important to note that the encrypted data cube does not divulge any crucial information directly to potential attackers. This is attributed to the establishment of a secure index by the token within the role-based node key value, situated atop the encryption mechanism of the token.
• Verifiable Search Result
Our proposed scheme supports the verification of search results based on the hash proof of keyword, which is stored on the blockchain. The blockchain can verify whether the result has been tampered with or attacked. We aim to show that no PPT adversary can gain information about the data and search queries. The proof is demonstrated using the Real/Ideal simulation paradigm.
Our SE scheme is denoted as BSE-CDW (Boolean Keyword Searchable Encryption with Verifiability and Traceability for Cloud Data Warehouse). This scheme is founded upon the utilization of our B+Tree, inverted index, and bitmapping search index structures, with PHE serving as our underlying security mechanism. Suppose A represents a stateful adversary, S denotes a stateful simulator, and L embodies a stateful leakage algorithm. These entities are integral to the assessment of the following probabilistic experiments: Real_BSE-CDW(PHE) and Ideal_BSE-CDW(PHE). The BSE-CDW scheme is designed to provide robust security and efficiency in the realm of Boolean keyword searchable encryption, with additional features such as verifiability and traceability tailored for Cloud Data Warehouses. The Real_BSE-CDW(PHE) model reflects the practical execution of the scheme, while the Ideal_BSE-CDW(PHE) model serves as a theoretical benchmark, allowing us to gauge the idealized performance in a controlled environment. This approach supports a comprehensive evaluation of the scheme's security guarantees, verifiability, and traceability features, contributing to a thorough understanding of its capabilities in preserving confidentiality within the cloud data warehouse context.
In the Real_BSE-CDW(PHE) model, a challenger executes Algorithm 1 within our proposed system to generate the public and private key security parameters of the PHE. Simultaneously, an adversary A selects a data cube or a materialized view MV and generates a security index I along with a validator π. Then, they are sent to the challenger C. The challenger conducts a series of queries, denoted as q, where the number of queries is polynomial. For each query, A receives a token transmitted from the challenger. This token is obtained through the trapdoor generation algorithm, specifically Algorithm 6, (PHE(Pub, kw)) → {Tkw}. The search result is then obtained through the user query process and result verification algorithms, (I, Tkw, tq) → {MV_nkv, π_tc, π_q}. In the final step, A returns a bit, denoted as b. If b equals 1, the adversary accepts the result; otherwise, it rejects. This process represents a comprehensive evaluation of the Real_BSE-CDW scheme's security under the PHE framework. The challenger's generation of public and private keys, the adversary's selection and validation of a data cube, and the subsequent query-response interactions contribute to a robust assessment of the scheme's resilience against adversarial attempts. This experimental setup allows a thorough examination of the scheme's effectiveness in providing secure and efficient Boolean keyword searchable encryption with verifiability and traceability for cloud data warehouse.
In the context of Ideal_BSE-CDW(PHE), an adversary A selects a data cube MV. In accordance with the leakage function L, the simulator S generates a security index and a
verifier using the Setup(1^k) → {Priv1, Pub1, Priv2, Pub2, PHE parameters} algorithm. This information is then transmitted to adversary A for further evaluation. A performs a polynomial number of queries, denoted as q.
For each query, the simulator S furnishes A with the token Tkw and the corresponding validator π for A's response. Subsequently, A returns a bit, denoted as b. A value of b equal to 1 signifies the adversary's acknowledgment of the simulation; otherwise, the adversary rejects.
The critical assessment of BSE-CDW's L-confidentiality is subject to the existence of a probabilistic polynomial time simulator S for every probabilistic polynomial time adversary A.
Theorem 2: If there is a simulator capable of emulating the actions of an adversary within a polynomial time frame, we declare that the BSE-CDW scheme is L-confidential.
Proof: We aim to demonstrate the existence of a polynomial time simulator S and a probabilistic polynomial time adversary A, establishing indistinguishability between their outputs in both the Real and Ideal scenarios. Initially, S initiates the simulation by creating a secure index I′, randomly selecting node key pairs, and inserting them into the B+Tree. Simultaneously, S generates a random string π′ of length |π| to serve as a verifier.
The proof of keyword is established by hashing tokens of selected keywords. Each MV undergoes encryption via a pseudo-random function, linking it to a unique node key pair value (nkv). The confidentiality of the verifier is secured through MV encryption and nkv, rendering A incapable of distinguishing (I′, π′) from (I, π). Upon A initiating a search, S simulates a search token Tkw. Initially, the queried token Tkw undergoes hashing, verifying its existence based on the proof of keyword. If the token Tkw queried by A exists in I′, S randomly selects a result path and returns it to A. A is unable to differentiate between a real token Tkw and a simulated token Tkw. For subsequent queries, if tokens have been queried before, they remain consistent with their previous instances or match the initial token in the simulation. Additionally, when A simulates the update of the token Tkw, the updated token becomes Tkw′, and the verifier π′ is set to a random string of the same length as π. For each query, A randomly selects a string to simulate a search token. In the Real game, all tokens undergo encryption via the pseudo-random function F, preventing the adversary A from distinguishing whether the simulated token originates from Real_BSE-CDW(PHE) or Ideal_BSE-CDW(PHE). This simulation and encryption strategy establishes the indistinguishability of the outputs of the Real and Ideal scenarios.

C. USER AUTHENTICATION
Authentication serves as the initial security layer in our proposed scheme for validating user queries. To initiate the authentication process, the requesting user must input their designated Personally Identifiable Information (PII), specifically a personal trusted ID assigned during Phase 1 of system setup. This personal ID is a 12-bit alphanumeric code, establishing a foundational security requirement. Following authentication, the blockchain undertakes a crucial check to determine if the user's request is empty. In the event of an empty query, the request is promptly terminated, and a record is securely stored on the blockchain network. These blockchain records represent trusted transactions, encompassing both end-to-end network addresses and a timeline for auditors to conduct thorough audits. The identification of an empty query is pivotal as it is indicative of potential guessing attacks, wherein malicious actors attempt to compromise the authentication system using the "true equals true" methodology. Furthermore, this aligns with our commitment to confidentiality, preventing unauthorized access to all data or irrelevant information. This necessity for request verification underscores the importance of maintaining the integrity of our authentication system, ensuring its resilience against potential security threats. The subsequent section will delve into the intricacies of the query verification process.

D. BACKWARD SECURITY
Our proposed scheme achieves backward security in the context of a user's search query and the handling of encrypted data cubes. Backward security means that even if an adversary gains access to system records and operations, they cannot infer or understand sensitive information about data deletions. This strengthens the overall security of the system.
Our scheme achieves backward security based on the following mechanisms.
(1) User Search Queries: When a user submits a search query with a keyword (kw), the proxy server may record all operations related to each encrypted data cube whenever insertion or deletion of data associated with the keyword occurs. This means that the system keeps a record of relevant operations for auditing and tracking purposes.
(2) Proxy's Limited Knowledge: The proxy server does not have the ability to understand or learn about the content of data deletions. This is because all the indexes are encrypted, ensuring that the proxy only sees and records encrypted data.
(3) Blockchain's Role: The blockchain plays a role in facilitating backward security. It maintains an authentication list of users and imposes certain restrictions on user queries and decryption processes. These restrictions and authentication mechanisms are designed to enhance the security of the system.
The combination of these measures, including the encryption of indexes, user authentication, and query restrictions, ensures backward security.

VII. EVALUATION
To evaluate our proposed scheme, we performed a comparative analysis of the functional features and the computation cost of our scheme and three related works supporting searchable encryption in the cloud. In addition, we conducted
the experiments to demonstrate the search performance of our scheme and related works.

A. FUNCTIONALITY COMPARISON
This section presents a comparison of the features of our proposed system and related works including [4], [8], and [30]. Table 3 presents a comparison between our scheme and these related works across five distinct functions.

TABLE 3. Functionality comparison.

As presented in Table 3, all schemes implemented lightweight encryption for the extracted keywords. For example, schemes [8] and [30] utilized symmetric encryption, while scheme [4] and ours relied on partial homomorphic encryption. For the scope of search operations, only scheme [4] and ours support multiple keyword searches and Boolean expressions, while schemes [8] and [30] do not support Boolean expressions. Additionally, only the BPVSE scheme [8] and our system utilize blockchain technology to enhance the authentication and verification processes for both data users and search results. Lastly, our scheme uniquely supports proxy search caching, a critical feature for rapidly retrieving search results, particularly when there is a high volume of identical and frequently requested queries. This feature significantly improves search performance, especially when dealing with large volumes of cube data that are frequently accessed.

B. COMPUTATION COST COMPARISON
This section compares the computational cost of our work and schemes [4], [8], and [30], as presented in Table 4. To evaluate the cost for computing each property of each scheme, the following notations are used.
• |A0|: the number of attributes owned by the data owner.
• |AU|: the number of attributes owned by the data user.
• G0: exponentiation and XOR operations in group G0.
• G1: exponentiation in an elliptic curve group.
• Zp: the group {0, 1, . . . , p-1} with multiplication modulo p.
• L: the number of iterations in searching for the inverted index and/or bitmap index.
• B: the logarithm concerning the number of entries in the B+Tree.
• Esym: the cost of symmetric encryption.
• |W|: the average number of keywords per document.
• |Q|: the number of keywords in the user's query.

TABLE 4. Computation cost comparison.

Scheme [4] and our scheme share similar computational costs, with encryption and associated expenses generally dependent on the number of attributes and exponentiation in G0, while schemes [8] and [30] deal with the cost of symmetric encryption and decryption. Specifically, scheme [4] additionally uses multiple XOR operations that correspond to the number of keywords or attributes. In contrast, our scheme incorporates partially homomorphic encryption (PHE), which is considered lightweight compared to fully homomorphic encryption (FHE). The cost of generating a trapdoor does not significantly differ across all schemes. Given the similar encryption costs, there is a slightly higher computational cost for the trapdoor generation in scheme [8], where additional verification processes are involved in generating the trapdoor. In terms of search costs, only scheme [8] does not support multiple keywords and Boolean expressions, making it cost-efficient when dealing with single keywords. On the other hand, in scheme [30], the search cost is higher compared to scheme [4] and our scheme, particularly when handling a larger number of keywords per document, involving several multiplications in Zp. With regard to search structures, all schemes, except for scheme [8], implement a B+Tree index search structure to support multiple keyword searches. However, only scheme [4] and our scheme offer support for both multiple keywords and Boolean expressions, incurring comparable computational costs. Our scheme has a slightly higher cost than scheme [4] due to the integration of three different indexing search functions.

C. EXPERIMENTAL EVALUATION
In this section, we conducted experiments to measure the processing time for data cube generation, encryption, decryption, trapdoor generation, search, and query throughput. In addition, we measured the gas used in executing the smart contracts.
The implementation is done via Python's Cryptography and its standard library modules such as random, hashlib, csv, os, time, concurrent.futures, multiprocessing, pickle, threading, and datetime. Additionally, we employed third-party libraries such as phe [38] for the Paillier cryptographic system and web3 [39] for binding Python with
49860 VOLUME 12, 2024


S. Fugkeaw et al.: Achieving Secure, Verifiable, and Efficient Boolean Keyword SE for CDW

TABLE 5. Time cost of major operations.

TABLE 6. Processing time computation.

Ethereum. We also used machine learning in Python called


Scikit-learn [40] to stimulate the scheme [30]. The experi- FIGURE 6. Cost of search time comparison.
ments were done on an Intel(R) Xeon(R) E-2336 CPU @
2.9GHz and 16 GB of RAM on a server that is running on the
Ubuntu 20.04 Operating System. We employed the Ethereum
network as the blockchain platform for our simulation and
utilized Solidity to develop the smart contracts. The devel-
opment was carried out on Remix, which is a web-based
Integrated Development Environment (IDE) designed for
the Ethereum network. We utilized Ethereum’s smart con-
tracts as it fully utilizes the implementation of decentralized
access control and transparent auditable operations mecha-
nisms. This could allow for fine-grained control over who
can access, modify, or query the data stored in the cloud
data warehouse, reducing reliance on centralized entities for
access management.
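As a rough illustration of the additive property that makes Paillier a partial homomorphic scheme, the sketch below implements a toy version of the cryptosystem using only the Python standard library. It is not the phe library's implementation and not the code used in the experiments: the function names (keygen, encrypt, decrypt, add) and the tiny primes are assumptions chosen for readability, and the generator is fixed to g = n + 1, a common simplification. It assumes Python 3.8 or later for the modular inverse via pow.

```python
import math
import secrets


def keygen(p: int, q: int):
    """Derive a toy Paillier key pair from two distinct primes."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; valid for generator g = n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt 0 <= m < n as c = (1 + n)^m * r^n mod n^2."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    return (pow(1 + n, m, n2) * pow(r, n, n2)) % n2


def decrypt(n: int, priv, c: int) -> int:
    """Recover m = L(c^lam mod n^2) * mu mod n, with L(u) = (u - 1) // n."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add(n: int, c1: int, c2: int) -> int:
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    return (c1 * c2) % (n * n)


# Toy parameters for illustration only; real keys use far larger primes.
n, priv = keygen(1009, 1013)
c1, c2 = encrypt(n, 42), encrypt(n, 58)
total = decrypt(n, priv, add(n, c1, c2))
print(total)  # 100
```

The party that multiplies the two ciphertexts never sees 42 or 58, yet the key holder decrypts their sum. Encryption is randomized, so encrypting the same value twice generally yields different ciphertexts.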
• Performance Analysis
We first measured the cost of the major operations of our proposed scheme, including encryption and decryption time (TC), trapdoor generation time (TD), and verification time (TV). Table 5 shows the time used for running these operations. Table 6 presents how the time cost for each operation is computed.
In this paper, we conducted simulations of our proposed system to calculate the time required to perform its core functions, such as keyword and search result encryption, trapdoor generation, verification, and search result decryption. As demonstrated in Table 5, TC represents the time cost of using Paillier encryption and decryption for keywords and search results, which consistently takes around 202 milliseconds. For encryption and trapdoor generation, the time cost increases with the number of records n and a random value TE. In our scheme, TC is only the time taken to perform decryption, while TD is the time taken to generate a trapdoor when the user makes a search query. Lastly, TV is the time needed to verify the search result based on a hash-proof comparison.
• Search performance
We compared the search performance of our scheme with that of schemes [4], [8], and [30]. For the test, we varied the number of records contained in the data cube and measured the time used to complete the search process. In our experiment, we used TinyOlap, an open-source project on GitHub [41], to generate the 38,000 records for all data cubes. Figure 6 presents the search cost produced by these schemes.
As shown in Figure 6, schemes [8] and [30] displayed sensitivity to the number of data records, particularly when the records exceeded 50. In contrast, scheme [4] and our system provided relatively constant processing times. Specifically, for data cubes with 500 or more records, our system outperforms the other works. At this scale, our system completes searches in an average time of 601.37 milliseconds, followed by scheme [4] at 614.15 milliseconds, scheme [30] at 8,100 milliseconds, and scheme [8] at 14,841.795 milliseconds. These results confirm the efficiency of our proposed system in handling large-scale datasets.

FIGURE 7. System throughput.

• Throughput Measurement
In our throughput experiment, we investigated how the number of concurrent user requests affects the search output rate, as illustrated in Figure 7. The x-axis indicates the number of concurrent search requests, which is associated with the number of records in each of the 100 data cubes, while the y-axis illustrates the rate of search outputs generated per second. To assess system throughput, we initially conducted experiments using an Intel Xeon E-2336 processor with 6 cores and 12 threads and a base frequency of 2.9 GHz. We executed the experiments 20 times and averaged the results for
graphical representation. The initial results, depicted by the green line in Figure 7, indicated that the system achieved its highest throughput in the simulated environment, reaching nearly 184 queries per second (QPS) when the user request count reached 50. However, beyond 100 concurrent requests, the throughput declined sharply to 138 QPS, which we attribute to the exhaustion of server resources. These findings underscore the practical search performance of our algorithm, which supports various query functionalities, including Boolean expressions and comparison operators (<, >, =, !=), facilitating efficient range and timeframe searches. Our implementation leveraged concurrent processing through Python’s threading module and ProcessPoolExecutor, enabling fast concurrent queries and higher throughput.
To explore the relationship between computational resources and throughput in handling concurrent requests, we conducted additional experiments on an AMD Ryzen 9 5900X processor, equipped with 12 cores, 24 threads, and a base frequency of 3.7 GHz. The results, as depicted in Figure 7, demonstrated that more resources, such as CPU, led to increased throughput and resource utilization. Specifically, a server with a 27% increase in CPU and RAM exhibited a notable rise in throughput ranging from 40% to 70%. Consequently, data warehouse administrators can evaluate the required resources from Cloud Service Providers (CSPs) based on current transaction volumes and projected throughput demands.

• Processing Cost Incurred in Blockchain
Finally, we evaluate the performance of the smart contracts executed using blockchain technology by means of the gas cost. In our experiments, we simulated the network gas fees required by the blockchain to execute smart contracts. These contracts serve the purpose of authenticating users, creating trapdoors for individual user queries, and verifying search results against keyword hashes stored on the blockchain. In our experiment, we set the gas limit to 3,000,000 and defined several criteria for the different smart contracts. To facilitate user authentication, we randomly generated 1,000 users, each with their own distinct userID and password, and subsequently verified their queries. In the verification process, we assumed that there could be as many as 100,000 hashed keywords to be matched against the user’s query trapdoor. Table 7 shows the estimated gas cost used to run the smart contracts.

TABLE 7. Blockchain query cost (gas price per unit = 0.375 USD).

The gas price denotes the quantity of Ether (ETH) a user is willing to pay per unit of gas, typically measured in Gwei, where 1 Gwei equals 10^-9 ETH, or 1 billion Wei. The consumption cost in USD is computed as the product of the gas used and the gas price, representing the actual cost of a transaction or the execution of a smart contract. Our analysis reveals that the smart contracts incurred relatively low costs for the trapdoor generation and verification processes, with the exception of authentication and authorization, which involved a substantial gas fee due to multiple query handling operations. Consequently, the integration of blockchain into our proposed system did not significantly impact the overall performance of our scheme. However, it did enhance the trustworthiness of user requests by ensuring authentication and validation, as well as preserving the integrity of search results obtained from public clouds. Our system uses a proof-of-stake consensus, which relies on validators instead of miners and thereby addresses scalability and security while supporting a more dynamic decentralized ecosystem.

VIII. CONCLUSION AND FUTURE WORK
In this paper, we have presented a flexible, verifiable, and secure searchable encryption scheme with support for Boolean expressions over encrypted data cubes within a cloud-based data warehouse. Our scheme achieves both security and search performance through the integration of partial homomorphic encryption, an inverted index, and a B+Tree. In addition, we leveraged blockchain technology to automate search permission verification, user authentication, and search result validation. These tasks are executed in a manner that ensures scalability and immutability. Notably, we have utilized various search function types, such as inverted indexes, B+Trees, and bitmapping functions, to suit the different data types involved in searching over multidimensional data. Another key advantage of our proposed B+Tree indexing scheme is that it reduces the search space. Our experiments have demonstrated that our scheme can significantly save time and resources. The system can also provide reasonable throughput for supporting multiple concurrent OLAP query requests. For future work, we will investigate techniques to achieve full forward security in supporting keyword updates.

REFERENCES
[1] H. Yin, W. Zhang, H. Deng, Z. Qin, and K. Li, “An attribute-based searchable encryption scheme for cloud-assisted IIoT,” IEEE Internet Things J., vol. 10, no. 12, pp. 11014–11023, Jun. 2023, doi: 10.1109/JIOT.2023.3242964.
[2] X. Liu, H. Dong, N. Kumari, and J. Kar, “A pairing-free certificateless searchable public key encryption scheme for industrial Internet of Things,” IEEE Access, vol. 11, pp. 58754–58764, 2023, doi: 10.1109/ACCESS.2023.3285114.
[3] S. Guo, H. Geng, L. Su, S. He, and X. Zhang, “A rankable Boolean searchable encryption scheme supporting dynamic updates in a cloud environment,” IEEE Access, vol. 11, pp. 63475–63486, 2023, doi: 10.1109/ACCESS.2023.3284904.

[4] Y. Zheng, R. Lu, J. Shao, F. Yin, and H. Zhu, “Achieving practical symmetric searchable encryption with search pattern privacy over cloud,” IEEE Trans. Services Comput., vol. 15, no. 3, pp. 1358–1370, May 2022, doi: 10.1109/TSC.2020.2992303.
[5] Y. Wang, S.-F. Sun, J. Wang, J. K. Liu, and X. Chen, “Achieving searchable encryption scheme with search pattern hidden,” IEEE Trans. Services Comput., vol. 15, no. 2, pp. 1012–1025, Mar. 2022, doi: 10.1109/TSC.2020.2973139.
[6] J. Li, X. Lin, Y. Zhang, and J. Han, “KSF-OABE: Outsourced attribute-based encryption with keyword search function for cloud storage,” IEEE Trans. Services Comput., vol. 10, no. 5, pp. 715–725, Sep. 2017, doi: 10.1109/TSC.2016.2542813.
[7] Q. Zhang, S. Wang, D. Zhang, J. Sun, and Y. Zhang, “Authorized data secure access scheme with specified time and relevance ranked keyword search for industrial cloud platforms,” IEEE Syst. J., vol. 16, no. 2, pp. 2879–2890, Jun. 2022, doi: 10.1109/JSYST.2021.3093623.
[8] B. Chen, T. Xiang, D. He, H. Li, and K. R. Choo, “BPVSE: Publicly verifiable searchable encryption for cloud-assisted electronic health records,” IEEE Trans. Inf. Forensics Security, vol. 18, pp. 3171–3184, 2023, doi: 10.1109/TIFS.2023.3275750.
[9] J. Fu, N. Wang, B. Cui, and B. K. Bhargava, “A practical framework for secure document retrieval in encrypted cloud file systems,” IEEE Trans. Parallel Distrib. Syst., vol. 33, no. 5, pp. 1246–1261, May 2022, doi: 10.1109/TPDS.2021.3107752.
[10] L. Chen, Y. Xue, Y. Mu, L. Zeng, F. Rezaeibagha, and R. H. Deng, “CASE-SSE: Context-aware semantically extensible searchable symmetric encryption for encrypted cloud data,” IEEE Trans. Services Comput., vol. 16, no. 2, pp. 1011–1022, Mar. 2023, doi: 10.1109/TSC.2022.3162266.
[11] R. Zhou, X. Zhang, X. Wang, G. Yang, H.-N. Dai, and M. Liu, “Device-oriented keyword-searchable encryption scheme for cloud-assisted industrial IoT,” IEEE Internet Things J., vol. 9, no. 18, pp. 17098–17109, Sep. 2022, doi: 10.1109/JIOT.2021.3124807.
[12] L. Xue, “DSAS: A secure data sharing and authorized searchable framework for e-Healthcare system,” IEEE Access, vol. 10, pp. 30779–30791, 2022, doi: 10.1109/ACCESS.2022.3153120.
[13] Y. Yang, R. H. Deng, W. Guo, H. Cheng, X. Luo, X. Zheng, and C. Rong, “Dual traceable distributed attribute-based searchable encryption and ownership transfer,” IEEE Trans. Cloud Comput., vol. 11, no. 1, pp. 247–262, Jan. 2023, doi: 10.1109/TCC.2021.3090519.
[14] P. Zhang, Y. Chui, H. Liu, Z. Yang, D. Wu, and R. Wang, “Efficient and privacy-preserving search over edge–cloud collaborative entity in IoT,” IEEE Internet Things J., vol. 10, no. 4, pp. 3192–3205, Feb. 2023, doi: 10.1109/JIOT.2021.3132910.
[15] J. Liu, Y. Li, R. Sun, Q. Pei, N. Zhang, M. Dong, and V. C. M. Leung, “EMK-ABSE: Efficient multikeyword attribute-based searchable encryption scheme through cloud-edge coordination,” IEEE Internet Things J., vol. 9, no. 19, pp. 18650–18662, Oct. 2022, doi: 10.1109/JIOT.2022.3163340.
[16] Q. Liu, Y. Tian, J. Wu, T. Peng, and G. Wang, “Enabling verifiable and dynamic ranked search over outsourced data,” IEEE Trans. Services Comput., vol. 15, no. 1, pp. 69–82, Jan. 2022, doi: 10.1109/TSC.2019.2922177.
[17] G. Liu, G. Yang, S. Bai, H. Wang, and Y. Xiang, “FASE: A fast and accurate privacy-preserving multi-keyword top-k retrieval scheme over encrypted cloud data,” IEEE Trans. Services Comput., vol. 15, no. 4, pp. 1855–1867, Jul. 2022, doi: 10.1109/TSC.2020.3023393.
[18] M. Zeng, H. Qian, J. Chen, and K. Zhang, “Forward secure public key encryption with keyword search for outsourced cloud storage,” IEEE Trans. Cloud Comput., vol. 10, no. 1, pp. 426–438, Jan. 2022, doi: 10.1109/TCC.2019.2944367.
[19] Z.-Y. Liu, Y.-F. Tseng, R. Tso, Y.-C. Chen, and M. Mambo, “Identity-certifying authority-aided identity-based searchable encryption framework in cloud systems,” IEEE Syst. J., vol. 16, no. 3, pp. 4629–4640, Sep. 2022, doi: 10.1109/JSYST.2021.3103909.
[20] P. Chaudhari and M. L. Das, “KeySea: Keyword-based search with receiver anonymity in attribute-based searchable encryption,” IEEE Trans. Services Comput., vol. 15, no. 2, pp. 1036–1044, Mar. 2022, doi: 10.1109/TSC.2020.2973570.
[21] M. Wang, Y. Guo, C. Zhang, C. Wang, H. Huang, and X. Jia, “MedShare: A privacy-preserving medical data sharing system by using blockchain,” IEEE Trans. Services Comput., vol. 16, no. 1, pp. 438–451, Jan. 2023, doi: 10.1109/TSC.2021.3114719.
[22] M. Ihtesham, S. Tahir, H. Tahir, A. Hasan, A. Sultan, S. Saeed, and O. Rana, “Privacy preserving and serverless homomorphic-based searchable encryption as a service (SEaaS),” IEEE Access, vol. 11, pp. 115204–115218, 2023, doi: 10.1109/access.2023.3324817.
[23] Y. Zhang, T. Zhu, R. Guo, S. Xu, H. Cui, and J. Cao, “Multi-keyword searchable and verifiable attribute-based encryption over cloud data,” IEEE Trans. Cloud Comput., vol. 11, no. 1, pp. 971–983, Jan. 2023, doi: 10.1109/TCC.2021.3119407.
[24] H. Li, Q. Huang, J. Huang, and W. Susilo, “Public-key authenticated encryption with keyword search supporting constant trapdoor generation and fast search,” IEEE Trans. Inf. Forensics Security, vol. 18, pp. 396–410, 2023, doi: 10.1109/TIFS.2022.3224308.
[25] J. Du, J. Zhou, Y. Lin, W. Zhang, and J. Wei, “Secure and verifiable keyword search in multiple clouds,” IEEE Syst. J., vol. 16, no. 2, pp. 2660–2671, Jun. 2022, doi: 10.1109/JSYST.2021.3069200.
[26] T. Liu, Y. Miao, K. R. Choo, H. Li, X. Liu, X. Meng, and R. H. Deng, “Time-controlled hierarchical multikeyword search over encrypted data in cloud-assisted IoT,” IEEE Internet Things J., vol. 9, no. 13, pp. 11017–11029, Jul. 2022, doi: 10.1109/JIOT.2021.3126468.
[27] F. Li, J. Ma, Y. Miao, Z. Liu, K. R. Choo, X. Liu, and R. H. Deng, “Towards efficient verifiable Boolean search over encrypted cloud data,” IEEE Trans. Cloud Comput., vol. 11, no. 1, pp. 839–853, Jan. 2023, doi: 10.1109/TCC.2021.3118692.
[28] X. Li, Q. Tong, J. Zhao, Y. Miao, S. Ma, J. Weng, J. Ma, and K. R. Choo, “VRFMS: Verifiable ranked fuzzy multi-keyword search over encrypted data,” IEEE Trans. Services Comput., vol. 16, no. 1, pp. 698–710, Jan. 2023, doi: 10.1109/TSC.2021.3140092.
[29] X. Liu, X. Yang, Y. Luo, and Q. Zhang, “Verifiable multikeyword search encryption scheme with anonymous key generation for medical Internet of Things,” IEEE Internet Things J., vol. 9, no. 22, pp. 22315–22326, Nov. 2022, doi: 10.1109/JIOT.2021.3056116.
[30] H. Shen, L. Xue, H. Wang, L. Zhang, and J. Zhang, “B+-tree based multi-keyword ranked similarity search scheme over encrypted cloud data,” IEEE Access, vol. 9, pp. 150865–150877, 2021, doi: 10.1109/ACCESS.2021.3125729.
[31] Y. Zheng, R. Lu, Y. Guan, J. Shao, and H. Zhu, “Achieving efficient and privacy-preserving exact set similarity search over encrypted data,” IEEE Trans. Dependable Secure Comput., vol. 19, no. 2, pp. 1090–1103, Mar. 2022, doi: 10.1109/TDSC.2020.3004442.
[32] F. Li, J. Ma, Y. Miao, Q. Jiang, X. Liu, and K. R. Choo, “Verifiable and dynamic multi-keyword search over encrypted cloud data using bitmap,” IEEE Trans. Cloud Comput., vol. 11, no. 1, pp. 336–348, Jan. 2023, doi: 10.1109/TCC.2021.3093304.
[33] J. Shao, R. Lu, Y. Guan, and G. Wei, “Achieve efficient and verifiable conjunctive and fuzzy queries over encrypted data in cloud,” IEEE Trans. Services Comput., vol. 15, no. 1, pp. 124–137, Jan. 2022, doi: 10.1109/TSC.2019.2924372.
[34] X. Wang, J. Ma, X. Liu, Y. Miao, Y. Liu, and R. H. Deng, “Forward/backward and content private DSSE for spatial keyword queries,” IEEE Trans. Dependable Secure Comput., vol. 20, no. 4, pp. 3358–3370, Jul. 2023, doi: 10.1109/TDSC.2022.3205670.
[35] W. Rong-Bing, L. Ya-Nan, X. Hong-Yan, F. Yong, and Z. Yong-Gang, “Electronic scoring scheme based on real Paillier encryption algorithms,” IEEE Access, vol. 7, pp. 128043–128053, 2019, doi: 10.1109/ACCESS.2019.2939227.
[36] Y. Miao, R. H. Deng, K. R. Choo, X. Liu, and H. Li, “Threshold multi-keyword search for cloud-based group data sharing,” IEEE Trans. Cloud Comput., vol. 10, no. 3, pp. 2146–2162, Jul. 2022, doi: 10.1109/TCC.2020.2999775.
[37] Y. Lu and J. Li, “Lightweight public key authenticated encryption with keyword search against adaptively-chosen-targets adversaries for mobile devices,” IEEE Trans. Mobile Comput., vol. 21, no. 12, pp. 4397–4409, Dec. 2022, doi: 10.1109/TMC.2021.3077508.
[38] CSIRO’s Data61, GitHub Repository. (2013). Python Paillier Library. Accessed: Oct. 24, 2023. [Online]. Available: https://github.com/data61/python-paillier
[39] Ethereum, GitHub Repository. (2013). web3.py. Accessed: Oct. 24, 2023. [Online]. Available: https://github.com/ethereum/web3.py
[40] scikit-learn, Scikit-learn.org. (2019). Scikit-Learn: Machine Learning in Python. Accessed: Oct. 25, 2023. [Online]. Available: https://scikit-learn.org/stable/
[41] T. Zeutschler. (Jun. 30, 2023). TinyOlap. GitHub. Accessed: Oct. 7, 2023. [Online]. Available: https://github.com/Zeutschler/tinyolap

SOMCHART FUGKEAW (Member, IEEE) received the bachelor’s degree in management information systems from Thammasat University, Bangkok, Thailand, the master’s degree in computer science from Mahidol University, Thailand, and the Ph.D. degree in electrical engineering and information systems from The University of Tokyo, Japan, in 2017. He is currently an Assistant Professor with the Sirindhorn International Institute of Technology, Thammasat University. His research interests include information security, access control, cloud computing security, blockchain, big data analysis, and high-performance computing. He served as a Reviewer for several international journals, such as IEEE Access, IEEE Transactions on Information Forensics and Security, IEEE Transactions on Dependable and Secure Computing, IEEE Transactions on Cloud Computing, IEEE Transactions on Big Data, IEEE Transactions on Network Science and Engineering, IEEE Transactions on Network and Service Management, IEEE Transactions on Parallel and Distributed Systems, Computers and Security, IEEE Systems Journal, IEEE Internet of Things Journal, and ACM Transactions on Multimedia Computing, Communications, and Applications.

LYHOUR HAK received the bachelor’s degree in computer engineering from Sirindhorn International Institute of Technology, Thammasat University, where he is currently pursuing the master’s degree in computer engineering. His research interests include network security, blockchain, and information security.

THANARUK THEERAMUNKONG received the bachelor’s, master’s, and Ph.D. degrees in computer science from Tokyo Institute of Technology, in 1990, 1992, and 1995, respectively. He is currently a Professor with the School of Information, Communication and Technology (ICT), Sirindhorn International Institute of Technology, Thammasat University, Thailand. His current research interests include data mining, machine learning, natural language processing, and information retrieval.
