International Journal of Computer Engineering and Applications, Volume X, Issue III, March 16

www.ijcea.com ISSN 2321-3469

A MULTI KEYWORD RANKED SEARCH TECHNIQUE WITH PROVISION FOR
DYNAMIC UPDATE OF ENCRYPTED DOCUMENTS IN CLOUD
Neeshima P.P., Pavitra Shankar Hegde, Poojashree P., Pallavi G.B.
Department of Computer Science Engineering, BMSCE, Bangalore, India

ABSTRACT:
Cloud computing has become popular due to its flexibility in data utilization and the
reduced cost of data management. Because of these benefits, a large number of data owners are
moving to the cloud. Confidential data must be encrypted to protect it from adversaries, but
keyword-based search, which works on plaintext, cannot be applied directly to encrypted text.
With this requirement in mind, a detailed survey has been carried out on the security issues of
the public cloud and on Searchable Encryption (SE) schemes, covering both single keyword and
multi keyword ranked search schemes.
Keywords: Cloud security, searchable encryption techniques, multi-keyword search, ordering
and ranking, dynamic operation

[1] INTRODUCTION
“Cloud computing” is a term for a means of delivering any sort of information technology
(IT) component, from computing power to computing infrastructure and applications [1], [2]. It
is an emerging style of computing in which data, applications and resources are provided to
users as services on the web wherever and whenever they require them. In other words, cloud
computing delivers hardware, software, storage and network as services to users. One of its
most appealing factors is that, since users obtain computing resources as a service, they need
not buy or build an IT infrastructure or even understand the underlying technology. A second
factor is that cloud vendors offer their services on a pay-per-use basis [2], reducing capital
expenditure and operational costs.
Cloud is classified into three major deployment models: Private, Public and Hybrid.
Private cloud: A private cloud is one that is created inside a company’s firewall and run on
on-site servers. Here, a business turns its own IT environment into a cloud and uses it to
deliver services to its users. It provides high security and control over the data, but at a
very high price.
Public cloud: In a public cloud, an individual or a business, especially a small or medium
enterprise, rents the services provided by the cloud and pays only for what it has used. That
is, a public cloud provides services to its users through a third-party service provider. It is
a shared cloud which can be accessed by anyone. Its services are cost-effective compared to
those offered by a private cloud. Hence, users mostly prefer a public cloud to a private cloud
for outsourcing their data and applications.
Hybrid cloud: A hybrid cloud combines private and public clouds, aiming to offer the best of
both.
The capabilities and the possible threats of the public cloud are surveyed in detail in the
remainder of the paper. Section 2 discusses the security issues faced by the public cloud and a
few theoretical solutions. Section 3 explains the various Searchable Encryption techniques
proposed so far, and Section 4 concludes the paper.

[2] SECURITY IN A PUBLIC CLOUD


The biggest challenge faced by the public cloud is maintaining the security of the
data stored. Despite the various merits of the cloud, outsourcing sensitive and confidential
information (for example, the financial data and employee records of a company, government
documents, and personal health records) to remote servers raises privacy concerns. Since a
third party stores the data, users cannot know what is happening to it. There is a chance that
the cloud service providers (CSPs) themselves may access users’ information without
authorization. Several mechanisms and techniques are implemented in order to secure user data
to the maximum possible extent. Some of them include centralization of data, offloading of
work, efficient credential management, usage of authorization infrastructures like Kerberos,
encryption techniques and authentication processes such as client and server certificates.
In [3], cloud computing is described as a means of providing computing as a service to the
public (a “utility”); the term is by now very familiar to information technology (IT)
enthusiasts. The cloud provides on-demand access to computing resources over the network and
offers technology in a pay-as-you-use service model. As a result, both individuals and small
and medium enterprises can store, process and manage data conveniently with minimal
expenditure. It also provides efficient and rapid deployment of computing services with very
little management overhead.
Though cloud computing has many advantages, it raises numerous concerns with respect to
security and privacy. In a public cloud, users have no control over the systems that manage
their data and documents, because the public cloud is administered by cloud service providers,
and there is a chance that the providers themselves might examine users’ confidential data.
There is no guarantee that there are no malicious insiders within the administration who
threaten the data assets. Even if it were guaranteed that there are no malicious insiders
within the CSPs, external threats such as software bugs, media failures and malware can still
cause serious privacy and security problems. There have been several cases of cloud outages
and security breaches, including at some of the noted cloud providers. The paper [3] mentions
some of them: “Apple’s iPad security breach, Amazon S3’s downtime and mass email deletions by
Gmail”. It discusses several security issues of the public cloud, out of which we concentrate
on the issue of “data service outsourcing security”.

Users outsource a lot of confidential and personal data and documents to the cloud,
where they can be compromised by intruders. One way to protect data from such intruders is to
encrypt all data before outsourcing it to the cloud. But then, to use the data, the encrypted
documents have to be downloaded and decrypted, which is tedious and, for large collections,
impractical. Encryption also makes data utilization services such as plaintext keyword search
problematic. The paper cites Searchable Encryption (SE) techniques as an effective solution
to this problem. Searchable encryption techniques employ a prebuilt encrypted search index
which allows users possessing proper tokens to conduct a search on encrypted data without
decrypting any of it. Based on the performance, usability and scalability requirements,
several forms of search can be considered, all over encrypted data: “similarity-search”,
“secure-ranked search”, “secure-multi-keyword-semantic search”, etc.
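The idea of a prebuilt index that is searched with tokens can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a scheme from any of the surveyed papers: the helper names (make_token, build_index, search) and the sample documents are hypothetical, keyword tokens are derived with HMAC-SHA256, and the documents themselves are assumed to be encrypted separately. A deterministic index of this kind still reveals which documents match a repeated query.

```python
import hashlib
import hmac


def make_token(key: bytes, keyword: str) -> str:
    """Derive a deterministic search token for a keyword (hypothetical helper)."""
    return hmac.new(key, keyword.lower().encode("utf-8"), hashlib.sha256).hexdigest()


def build_index(key: bytes, documents: dict) -> dict:
    """Owner side: map each keyword token to the ids of documents containing it."""
    index = {}
    for doc_id, text in documents.items():
        for word in set(text.lower().split()):
            index.setdefault(make_token(key, word), set()).add(doc_id)
    return index


def search(index: dict, token: str) -> set:
    """Server side: look up a token without ever seeing the keyword itself."""
    return index.get(token, set())


# Illustrative data; in practice the documents would be stored encrypted.
key = b"demo-key-known-only-to-the-owner"
docs = {"d1": "cloud storage audit", "d2": "health record", "d3": "cloud health"}
index = build_index(key, docs)
print(sorted(search(index, make_token(key, "cloud"))))  # ['d1', 'd3']
```

Only a user who holds the key can produce a token that matches an index entry; a token built with a different key finds nothing.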
Creating a secure storage service on a public cloud is discussed in [4], which proposes
architectures that accomplish this task. A public cloud is less secure than a private cloud
because the user data stored in it is out of the user’s control and has a large potential of
being compromised. To preserve the “confidentiality”, “privacy” and “integrity” of data on a
public cloud, data must be encrypted before it is stored there. In [4], an attempt is made to
design a virtual private storage service on the basis of the cryptographic techniques
existing at the time. Two architectures are proposed, one for a consumer and the other for an
enterprise. At the core of both, data which is to be sent to the cloud is first processed by a
data processor. Once it is stored in the cloud, a data verifier checks it for tampering. A
token generator generates tokens with which the CSP can retrieve user data, and a credential
generator implements an “access-control-policy” under which other users can access the
encrypted data and decrypt it.
But a main drawback of this scheme is the high cost of data usability: whoever wants to
retrieve some data has to download the whole data set first and then decrypt it locally. This
is not feasible in practice.
C. Gentry, in [5], proposes a fully homomorphic encryption scheme, which allows the
computation of arbitrary functions over encrypted data without the decryption key. Given
encryptions of messages m1, ..., mt, one can efficiently compute a compact ciphertext that
encrypts f(m1, ..., mt) for any efficiently computable function f. But this scheme is not
practical to deploy because of the high computational overhead that it causes for the cloud
user and the cloud server.
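To make the notion of computing on ciphertexts concrete, the following toy sketch shows the multiplicative homomorphism of textbook RSA. It is not Gentry's scheme and is not fully homomorphic: it supports only multiplication, and the parameters are deliberately tiny and insecure. All names and values are chosen purely for illustration.

```python
# Toy textbook-RSA parameters (assumed for illustration, far too small to be secure).
P, Q = 61, 53
N = P * Q   # modulus, 3233
E = 17      # public exponent
D = 2753    # private exponent: E * D = 1 mod (P - 1) * (Q - 1)


def encrypt(message: int) -> int:
    """Encrypt an integer smaller than N with the public key."""
    return pow(message, E, N)


def decrypt(ciphertext: int) -> int:
    """Decrypt with the private key."""
    return pow(ciphertext, D, N)


# The server multiplies two ciphertexts without knowing the plaintexts 7 and 6.
c1, c2 = encrypt(7), encrypt(6)
product_ciphertext = (c1 * c2) % N

# The key holder decrypts the result and obtains the product of the plaintexts.
print(decrypt(product_ciphertext))  # 42
```

A fully homomorphic scheme extends this idea to both addition and multiplication, and therefore to arbitrary functions, which is where the heavy computational cost arises.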
To overcome such problems, practical solutions such as searchable encryption (SE)
schemes were proposed in later years. These schemes are constructed using either public key
cryptography or symmetric key cryptography. D. Boneh et al., in [6], suggest a searchable
encryption scheme based on public key cryptography. The authors propose a mechanism in which
the receiver provides a key to a gateway (or router) that enables the gateway to test whether
a given word is a keyword of the sent data without learning anything else about that data.
This mechanism is referred to as “Public Key Encryption with keyword Search”. But, due to the
computation cost of public key encryption in practical applications, this mechanism is
applicable only for searching on a small number of keywords. Many practical symmetric key
based searchable encryption schemes have also been proposed in recent years. Some of them are
discussed in the next section.

[3] SEARCHABLE ENCRYPTION TECHNIQUES


[3.1] SINGLE KEYWORD SEARCH
[3.1.1] Similarity based Search
M. Kuzu et al., in [7], proposed a similarity based search scheme which retrieves data
according to similarity in a specific feature instead of exact query matching. Secure LSH
(locality sensitive hashing) is used for fast similarity search in multi-dimensional spaces
over encrypted data.
Fuzzy keyword search, proposed by J. Li et al. in [8], enhances system usability by
returning the matching files when the user’s search keyword exactly matches a predefined
keyword or, when exact matching fails, the closest possible matching files based on keyword
similarity semantics. Both schemes tolerate minor typos and formatting inconsistencies, which
are typical of user searching behavior and happen quite frequently.
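One way to picture typo tolerance is a wildcard-based fuzzy set for edit distance 1, sketched below on plaintext keywords. This is an illustrative assumption rather than the exact construction of [7] or [8]: the function names are hypothetical, and in an actual scheme each variant would be turned into a keyed trapdoor instead of being compared in the clear.

```python
def fuzzy_set(word: str) -> set:
    """Return the word plus its wildcard variants covering edit distance 1."""
    variants = {word}
    for i in range(len(word) + 1):
        # '*' inserted at position i stands for one extra character.
        variants.add(word[:i] + "*" + word[i:])
        if i < len(word):
            # '*' replacing position i stands for a substituted or dropped character.
            variants.add(word[:i] + "*" + word[i + 1:])
    return variants


def fuzzy_match(query: str, keyword: str) -> bool:
    """Two words match when their fuzzy sets share at least one variant."""
    return not fuzzy_set(query).isdisjoint(fuzzy_set(keyword))


print(len(fuzzy_set("castle")))        # 14 variants for a 6-letter word
print(fuzzy_match("casle", "castle"))  # True: one missing letter
print(fuzzy_match("cloud", "castle"))  # False
```

The set grows linearly with the keyword length (2n + 2 entries for a word of n letters), which keeps the index much smaller than enumerating every possible misspelling.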
However, these schemes do not support rank-ordered search, so server bandwidth is wasted
and network traffic is increased. They also support only single keyword search, whereas for a
large number of data users and documents, multiple keywords must be allowed in the search
request.
[3.1.2] Conjunctive keyword search
P. Golle et al. have proposed a scheme in [9] in which documents are searched and retrieved
using all of the several keywords provided by the user.
In one of the techniques discussed, the server is provided with a separate capability for
each keyword and then computes the intersection of all the matched sets. The communication
cost per query is linear in the number of documents stored, and this communication can be
done offline. In the other technique, a meta-keyword is defined for every possible
conjunction of keywords. However, this leads to an exponential blow-up in the amount of data
that needs to be stored on the server.
But these techniques suffer from security issues, as the server may learn confidential
information from the capabilities it is supplied with. Moreover, a complete solution to the
Boolean search problem also requires disjunctive search, which is not provided in this
scheme.
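The per-keyword intersection approach described above can be sketched as follows. The posting lists and the function name are hypothetical, and the sketch works on plaintext identifiers only; it also makes the leakage visible, since the server sees the full matched set of every individual keyword before intersecting them.

```python
def conjunctive_search(postings: dict, keywords: list) -> set:
    """Return the ids of documents that contain every keyword in the query."""
    matched_sets = [postings.get(keyword, set()) for keyword in keywords]
    if not matched_sets:
        return set()
    # Start from the smallest set so the running intersection stays small.
    matched_sets.sort(key=len)
    result = set(matched_sets[0])
    for matched in matched_sets[1:]:
        result &= matched
        if not result:
            break
    return result


# Illustrative posting lists: keyword -> ids of documents containing it.
postings = {
    "cloud": {"d1", "d3", "d4"},
    "health": {"d2", "d3", "d4"},
    "record": {"d2", "d3"},
}
print(conjunctive_search(postings, ["cloud", "health", "record"]))  # {'d3'}
```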
[3.1.3] Rank-ordered search
A. Swaminathan et al. in [10], S. Zerr et al. in [11] and C. Wang et al. in [12] have
proposed schemes which return rank-ordered relevant data for a set of query terms sent to the
server, using relevance scoring and order-preserving encryption. Only the top-ranked
documents that the user wants are retrieved, thus reducing network traffic and bandwidth
usage. The term frequency is used to build the indices, which are then secured against
statistical attacks. But for a large number of data users and documents, multiple keywords
must be allowed in the search. Also, dynamic data updating is not supported in these schemes.
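Ranking by term frequency and returning only the top-ranked documents can be sketched as below. This is a plaintext illustration with hypothetical names and data; the surveyed schemes additionally protect the scores (for example with order-preserving encryption), which is not shown here.

```python
import heapq
from collections import Counter


def top_k(documents: dict, keyword: str, k: int) -> list:
    """Return the k documents with the highest term frequency for the keyword."""
    scores = {}
    for doc_id, text in documents.items():
        words = text.lower().split()
        count = Counter(words)[keyword.lower()]
        if count:
            # Term frequency normalized by document length.
            scores[doc_id] = count / len(words)
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


docs = {
    "d1": "cloud cloud storage",
    "d2": "cloud health record data",
    "d3": "health record",
}
for doc_id, score in top_k(docs, "cloud", 2):
    print(doc_id, round(score, 3))
```

Because only the k best entries are returned, the documents that do not make the cut never leave the server, which is the bandwidth saving described above.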

[3.1.4] Parallel and dynamic search


In [13], S. Kamara et al. have proposed a new, improved SSE (symmetric searchable
encryption) scheme that supports efficient, dynamic search in sub-linear search time. A
KRB tree, a new tree-based multi-map data structure, indexes a document collection so that
searching occurs in optimal time and updates in logarithmic time. The updates of the scheme do
not leak any information about the keywords contained in a newly deleted or added document
apart from the information which is leaked through search tokens. Hence it provides privacy
and security. The earlier SSE scheme proposed by S. Kamara et al. in [14] used an encrypted
inverted index, which was not well suited to handling dynamic collections (its construction is
complex). Its update operations also revealed a non-trivial amount of information, and
searching was done sequentially. The newly proposed SSE scheme removes all of these
disadvantages. Though it improves on the earlier schemes, it is applicable only to single
keyword search and is hence of limited practical use.
[3.1.5] Index based search
E.-J. Goh et al., in [15], have proposed a secure index based search technique. A secure
index is a data structure that allows a user holding a trapdoor for a target word to test
whether the index contains that word. The author has presented a new security model for
indexes called semantic security against adaptive chosen keyword attack. This technique
guarantees that the information contained in the documents is not revealed by their indexes,
or by the indexes of other documents, unless a valid trapdoor is generated. Thus secure
indexes are advantageous over hash tables, as they do not leak information about the contents
of the documents.
But secure indexes do reveal information such as document size. A second disadvantage is
that this kind of search is suitable only in a multi-user environment where the encrypted
documents and indexes are updated frequently. For a single-user environment, a hash table of
search term and pointer pairs is suitable.

[3.2] MULTI KEYWORD SEARCH


[3.2.1] Similarity based ranking search
W. Sun et al. in [16] and N. Cao et al. in [17] provide multi-keyword search schemes that
retrieve the documents most relevant to the user’s query. The vector space model, the most
popular model supporting both conjunctive and disjunctive search, is used here: document
rankings are obtained by comparing the angles, i.e., the cosine values, between the document
vector (containing normalized TF values) and the query vector (containing normalized IDF
values of the query keywords). SIS (secure index scheme) is used to divide the document
vector into several sub-document vectors which are part of index trees at different levels
for efficient searching. For better security, a randomization approach (phantom terms) is
proposed to prevent leakage of sensitive information and hence achieve better keyword
privacy. EMTS (enhanced multi-keyword text search), which provides better security than BMTS
under the known background model, is proposed for multi-keyword text search and has a search
time comparable to that of BMTS. Although the authors present a way to perform secure
multi-keyword ranked search with better than linear search efficiency, it results in
precision loss. The user also needs to find an appropriate tradeoff between search precision
and privacy.
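The cosine ranking between a TF document vector and an IDF query vector can be sketched on plaintext as follows. The weighting formulas (log-scaled TF and log(1 + N/df) for IDF), the vocabulary and the sample documents are assumptions made for illustration; the cited schemes define their own weights and perform the comparison over encrypted vectors.

```python
import math


def tf_vector(text: str, vocabulary: list) -> list:
    """Unit-length vector of log-scaled term frequencies for one document."""
    words = text.lower().split()
    raw = [1 + math.log(words.count(t)) if t in words else 0.0 for t in vocabulary]
    norm = math.sqrt(sum(x * x for x in raw))
    return [x / norm for x in raw] if norm else raw


def idf_query_vector(query_terms: set, vocabulary: list, documents: dict) -> list:
    """Unit-length vector holding IDF weights for the query terms only."""
    n = len(documents)
    raw = []
    for term in vocabulary:
        df = sum(1 for text in documents.values() if term in text.lower().split())
        raw.append(math.log(1 + n / df) if term in query_terms and df else 0.0)
    norm = math.sqrt(sum(x * x for x in raw))
    return [x / norm for x in raw] if norm else raw


def rank(query_terms: set, vocabulary: list, documents: dict) -> list:
    """Order document ids by the cosine (dot product of unit vectors), best first."""
    query = idf_query_vector(query_terms, vocabulary, documents)
    scores = {
        doc_id: sum(d * q for d, q in zip(tf_vector(text, vocabulary), query))
        for doc_id, text in documents.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


vocabulary = ["cloud", "health", "record", "storage"]
docs = {"d1": "cloud storage cloud", "d2": "health record", "d3": "cloud health record"}
print(rank({"cloud", "storage"}, vocabulary, docs))  # ['d1', 'd3', 'd2']
```

A document that matches only some of the query keywords still receives a non-zero score, which is how the vector space model covers disjunctive as well as conjunctive queries.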
B. Wang et al. proposed a fuzzy search scheme in [18] which relies on algorithmic
design (LSH and Bloom filters) rather than on expanding the index file, and hence achieves
high efficiency in terms of computation and storage. It also eliminates the need for a
predefined keyword dictionary.
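A Bloom filter, one of the two building blocks named above, can be sketched as below. The class, its parameters and the SHA-256 based position function are illustrative assumptions; the scheme in [18] combines such filters with LSH and encryption, which this sketch does not attempt to reproduce.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array that answers 'possibly present' or 'definitely absent'."""

    def __init__(self, size_bits: int = 256, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting SHA-256 with the hash number.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


# One filter per document, holding that document's keywords.
doc_filter = BloomFilter()
for keyword in ["cloud", "storage", "audit"]:
    doc_filter.add(keyword)
print("cloud" in doc_filter)  # True: inserted items are always found
```

The filter has a fixed size regardless of how many keywords are inserted, at the price of occasional false positives; it never produces false negatives.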

A Minhash based privacy-preserving multi-keyword search method, which computes the
similarity between the private data and the encrypted query using a similarity function, is
proposed by C. Orencik et al. in [19]. The TF×IDF weighting factor is used to measure the
importance of a search term within a document relative to a collection of documents, and
hence provides appropriate ranking. Given a query, the server compares the query with the
generated searchable index tree and returns the results without learning any other
information. While the scheme aims at solving certain fundamental issues in searching for
data in the cloud, the Minhash algorithm does not provide exact ranking.
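The MinHash idea of estimating set similarity from short signatures can be sketched as follows. The salted SHA-256 hash family, the signature length and the sample keyword sets are assumptions for illustration and do not reproduce the construction of [19]; the point is only that the similarity is an estimate, which is why the resulting ranking is approximate.

```python
import hashlib


def minhash_signature(items: set, num_hashes: int = 64) -> list:
    """For each salted hash function, keep the minimum hash value over the set."""
    signature = []
    for i in range(num_hashes):
        signature.append(min(
            int.from_bytes(hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()[:8], "big")
            for item in items
        ))
    return signature


def estimated_jaccard(sig_a: list, sig_b: list) -> float:
    """The fraction of agreeing signature positions estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


doc_keywords = {"cloud", "storage", "audit", "backup"}
query_keywords = {"cloud", "storage", "audit"}
estimate = estimated_jaccard(
    minhash_signature(doc_keywords), minhash_signature(query_keywords)
)
print(estimate)  # an approximation of the true Jaccard similarity 3/4
```

Longer signatures reduce the estimation error but increase the index size, so the signature length is a tuning parameter.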
However, none of the above schemes supports dynamic updates to documents.
[3.2.2] Predicate based search
J. Katz et al., in [20], present predicate encryption, a public key encryption paradigm
that generalizes identity based encryption. In a predicate encryption scheme, secret keys are
associated with predicates, which are Boolean functions from a particular set, and
ciphertexts are associated with attributes taken from another set. A ciphertext with
attribute I can be decrypted by the secret key for predicate f if and only if f(I) = 1. With
predicate encryption, it can be tested whether certain predicates are satisfied, which serves
the purpose of secure searching without learning anything else about the original message.
But rank-ordered search is not employed in this scheme, so server bandwidth is wasted and
network traffic is increased. Dynamic updating of the stored documents is also not supported.
A search scheme over encrypted data that is multi-keyword, ranked and dynamic is
proposed by Zhihua Xia et al. in [21]; it overcomes the drawbacks discussed for the other
schemes. The idea is to combine the “vector space model” and the “TF×IDF model” for index
construction and query generation. For search efficiency, a special tree-based index
structure is constructed and a greedy Depth First Search algorithm is run on this index tree,
which results in a sub-linear search time. They use the secure kNN algorithm to encrypt the
index and query vectors and to calculate the relevance scores used for ranking.
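The property that lets a server compute relevance scores on encrypted vectors can be sketched with a single secret invertible matrix: the index vector is multiplied by the transpose of the matrix and the query vector by its inverse, so their inner product is unchanged. This is a simplified illustration with a hypothetical fixed 2×2 matrix; the full secure kNN construction also splits the vectors randomly and uses two matrices, which is omitted here.

```python
def mat_vec(matrix: list, vector: list) -> list:
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def transpose(matrix: list) -> list:
    return [list(column) for column in zip(*matrix)]


def dot(a: list, b: list):
    return sum(x * y for x, y in zip(a, b))


# Secret key: an invertible matrix and its inverse (fixed here for illustration).
M = [[2, 1], [1, 1]]
M_INV = [[1, -1], [-1, 2]]


def encrypt_index(p: list) -> list:
    """Owner side: hide the document vector as M^T p."""
    return mat_vec(transpose(M), p)


def encrypt_query(q: list) -> list:
    """User side: hide the query vector as M^-1 q."""
    return mat_vec(M_INV, q)


doc_vector, query_vector = [3, 4], [1, 2]
enc_doc, enc_query = encrypt_index(doc_vector), encrypt_query(query_vector)

# The server sees only the encrypted vectors, yet obtains the true relevance score.
print(enc_doc, enc_query)       # [10, 7] [-1, 3]
print(dot(enc_doc, enc_query))  # 11, equal to dot(doc_vector, query_vector)
```

Since the scores are preserved exactly, the server can rank documents and prune the index tree without learning the underlying TF and IDF values directly.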

[4] CONCLUSION
Various security issues of the public cloud and several techniques proposed to overcome
them have been surveyed. Based on the limitations of, and improvements suggested for, the
above schemes, we have arrived at the idea of implementing a generic SE scheme that combines
the multi-keyword ranked and dynamic properties proposed separately in some of the
aforementioned schemes. Experiments could then be conducted to demonstrate the efficiency of
the proposed scheme. We intend to implement our idea using the AES encryption algorithm as
part of our future work.

REFERENCES
[1] “Cloud computing”, https://en.wikipedia.org/wiki/Cloud_computing
[2] “Cloud computing definition”, http://searchcloudcomputing.techtarget.com/definition/cloud-computing

[3] K. Ren, C. Wang, and Q. Wang, “Security challenges for the public cloud,” IEEE Internet
Computing, vol. 16, no. 1, pp. 69–73, 2012.
[4] S. Kamara and K. Lauter, “Cryptographic cloud storage,” in Financial Cryptography and Data
Security. Springer, 2010, pp. 136–149.
[5] C. Gentry, “A fully homomorphic encryption scheme,” Ph.D. dissertation, Stanford University, 2009.
[6] D. Boneh, G. Di Crescenzo, R. Ostrovsky, and G. Persiano, “Public key encryption with keyword
search,” in Advances in Cryptology-Eurocrypt 2004. Springer, 2004, pp. 506–522.
[7] M. Kuzu, M. S. Islam, and M. Kantarcioglu, “Efficient similarity search over encrypted data,” in
Data Engineering (ICDE), 2012 IEEE 28th International Conference on. IEEE, 2012, pp. 1156–1167.
[8] J. Li, Q. Wang, C. Wang, N. Cao, K. Ren, and W. Lou, “Fuzzy keyword search over encrypted data
in cloud computing,” in INFOCOM, 2010 Proceedings IEEE. IEEE, 2010, pp. 1–5.
[9] P. Golle, J. Staddon, and B. Waters, “Secure conjunctive keyword search over encrypted data,” in
Applied Cryptography and Network Security. Springer, 2004, pp. 31–45.
[10] A. Swaminathan, Y. Mao, G.-M. Su, H. Gou, A. L. Varna, S. He, M. Wu, and D. W. Oard,
“Confidentiality-preserving rank-ordered search,” in Proceedings of the 2007 ACM workshop on
Storage security and survivability. ACM, 2007, pp. 7–12.
[11] S. Zerr, D. Olmedilla, W. Nejdl, and W. Siberski, “Zerber+R: Top-k retrieval from a confidential
index,” in Proceedings of the 12th International Conference on Extending Database Technology:
Advances in Database Technology. ACM, 2009, pp. 439–449.
[12] C. Wang, N. Cao, K. Ren, and W. Lou, “Enabling secure and efficient ranked keyword search
over outsourced cloud data,” Parallel and Distributed Systems, IEEE Transactions on, vol. 23, no. 8, pp.
1467–1479, 2012.
[13] S. Kamara and C. Papamanthou, “Parallel and dynamic searchable symmetric encryption,”
in Financial Cryptography and Data Security. Springer, 2013, pp. 258–274.
[14] S. Kamara, C. Papamanthou, and T. Roeder, “Dynamic searchable symmetric encryption,” in
Proceedings of the 2012 ACM conference on Computer and communications security. ACM, 2012, pp.
965–976.
[15] E.-J. Goh et al., “Secure indexes.” IACR Cryptology ePrint Archive, vol. 2003, p. 216, 2003.
[16] W. Sun, B. Wang, N. Cao, M. Li, W. Lou, Y. T. Hou, and H. Li, “Privacy-preserving multi-keyword
text search in the cloud supporting similarity-based ranking,” in Proceedings of the 8th ACM
SIGSAC symposium on Information, computer and communications security. ACM, 2013, pp. 71–82.
[17] N. Cao, C. Wang, M. Li, K. Ren, and W. Lou, “Privacy-preserving multi-keyword ranked search
over encrypted cloud data,” in IEEE INFOCOM, April 2011, pp. 829–837.
[18] B. Wang, S. Yu, W. Lou, and Y. T. Hou, “Privacy-preserving multikeyword fuzzy search over
encrypted data in the cloud,” in IEEE INFOCOM, 2014.
[19] C. Orencik, M. Kantarcioglu, and E. Savas, “A practical and secure multi-keyword search method
over encrypted cloud data,” in Cloud Computing (CLOUD), 2013 IEEE Sixth International Conference
on. IEEE, 2013, pp. 390–397.

[20] J. Katz, A. Sahai, and B. Waters, “Predicate encryption supporting disjunctions, polynomial
equations, and inner products,” in Advances in Cryptology–EUROCRYPT 2008. Springer, 2008, pp.
146–162.
[21] Z. Xia, X. Wang, X. Sun, and Q. Wang, “A secure and dynamic multi-keyword
ranked search scheme over encrypted cloud data,” in IEEE 2015.
