Verify & Decrypt Multi-Keyword Ranked Search Over Encrypted Data On Cloud Server
Verify & Decrypt Multi-Keyword Ranked Search Over Encrypted Data On Cloud Server
Verify & Decrypt Multi-Keyword Ranked Search Over Encrypted Data On Cloud Server
ABSTRACT
Cloud Computing is defined as a computing machine or group of machine which as a server/s, holds all the applications and
document necessary all at one place so that any user can access it from anywhere whoever has a access to that server without
actually using his/her physical machine.Any individual user who has permission to access the server can use the
server'sprocessing power to run an application, store data, or perform any other computing task. Therefore, instead of using
apersonal computer every-time to run the application, the individual can now run the application from anywhere in the
world.For data privacy, data which is traverses using cloud computing need to be encrypted, which makes data utilization a
very challenging task.In this paper, we evaluate and solve the challenging problem of privacy preserving multi-keyword ranked
search over encrypted cloud data (MRSE). Among various multi-keyword techniques we choose the efficient similarity measure
of coordinate matching, i.e., as many matches as possible, to capture the relevance of data documents to the search query.
We further use inner product similarity to quantitatively evaluate such similarity measure. We first propose a basic idea for
the MRSE based on secure inner product computation, and then give two significantly improved MRSE schemes to achieve
various stringent privacy requirements in two different threat models.
Keywords:- Cloud Computing, MRSE, Multi-keyword ranked Search, coordinate matching, inner product similarity,
privacy preserving, Confidential Data.
1.INTRODUCTION
Cloud computing involves highly available massive compute and storage platforms offering a wide rangeof services. It
enables convenient, on-demand network access to centralized resources that can be rapidly brings into effective action
with great efficiency and minimum management overhead. The advantages of Cloud Computinginclude: On-demand
self-service, easy network access, pooling, frequent resource elasticity, usage-based pricing. AsCloud Computing
becomes predominant, more easily offended information is being centralized into the cloud. So, thefact that data owners
and cloud server are not in the equal trusted domain may put the sourced data at risk. Thus, dataencryption makes
effective data utilization a very challenging task given that there could be a large amount ofoutsourced data files. Cloud
computing is called as a utility computing since it uses pay per use paradigm. Users have to pay for the usages. With
the technology of cloud computing, users can access a variety of resources like programs, storage and application
development platforms. Cloud is the extension of object oriented programming and it uses the concept of abstraction.
Cloud computing is an emerging technology which helps as an utility, through which clients are going to store their
data in the cloud server and using applications from a set of computing resources[1]. Here sensitive data is going to be
centralized in the server. In some times the cloud server may leaks the data to hackers [2]. The data is going to
encrypted before outsourced to achieve privacy. The encryption techniques increase the data utilization from a large
amount of data. To retrieve data files we introduced keyword search mechanism. By this mechanism the users are going
to retrieve the data files of their interest. In traditional search, encryption techniques the users are going to search data
by using keywords without decrypting it, they support only Boolean keyword search only [2][10]. In cloud computing
graded keyword search enhances the system usability by displaying the matching files by the help of relevance score. To
achieve security and usability we introduce advanced cryptographic and information retrieval techniques, and using
one-to-many order preserving symmetric encryption [3].
Basically there are three types of public cloud Services:
Infrastructure as a service (IaaS): In this most basic cloud service model, IaaS providers offer computers, as
physical or more often as virtual machines, and other resources.
Platform as a service (PaaS): In the PaaS model, cloud providers deliver a computing platform typically including
operating system, programming language execution environment, database, and web server. Application
developers can develop and run their software solutions on a cloud platform without the cost and complexity of
buying and managing the underlying hardware and software layers.
Software as a service (SaaS): In the SaaS model, cloud providers install and operate application software in the
cloud and cloud users access the software from cloud clients. The cloud users do not manage the cloud
infrastructure and platform on which the application is running. This eliminates the need to install and run the
application on the cloud user's own computers simplifying maintenance and support. What makes a cloud
application different from other applications is its scalability.
Page 192
Page 193
to quantitatively evaluate such similarity measure of that document to the search query. During the index construction,
each document is associated with a binary vector as a sub-index where each bit represents whether corresponding
keyword is contained in the document. The search query is also described as a binary vector where each bit means
whether corresponding keyword appears in this search request, so the similarity could be exactly measured by the inner
product of the query vector with the data vector. However, directly outsourcing the data vector or the query vector will
violate the index privacy or the search privacy. To meet the challenge of supporting such multi-keyword semantic
without privacy breaches, we propose a basic idea for the MRSE using secure inner product computation, which is
adapted from a secure k-nearest neighbor (kNN) technique [23], and then give two significantly improved MRSE
schemes in a step-by-step manner to achieve various stringent privacy requirements in two threat models with increased
attack capabilities. Our contributions are summarized as follows,
1. For the first time, we explore the problem of multi-keywordranked search over encrypted cloud data, andestablish a
set of strict privacy requirements for such a secure cloud data utilization system.
2. We propose two MRSE schemes based on the similaritymeasure of coordinate matching while meeting
differentprivacy requirements in two different threat models.
3. Thorough analysis investigating privacy and efficiencyguarantees of the proposed schemes is given, and
experimentson the real-world dataset further show theproposed schemes indeed introduce low overhead
oncomputation and communication.
The remainder of this paper is organized as follows. InSection II, we introduce the system model. Section III describes
the MRSE framework and privacy requirements, followed by section IV, which describes the proposed schemes, and
conclude the paper in Section V.
3.SYSTEM MODEL
Considering a cloud data hosting service involving three different entities, as illustrated in Fig. 1: the data owner, the
data user, and the cloud server. The data owner has a collection of data documents F to be outsourced to the cloud
server in the encrypted form C. To enable the searching capability over C for effective data utilization, the data owner,
before outsourcing, will first build an encrypted searchable index I from F, and then outsource both the index I and the
encrypted document collection C to the cloud server.
3.MRSE FRAMEWORK
In this section, we define the framework of multi-keyword ranked search over encrypted cloud data (MRSE) and
establish various strict systemize privacy requirements for such a secure cloud data utilization System. The operations
on the data documents are not shown in the framework since the data owner could easily employ the traditional
symmetric key cryptography to encrypt and then outsource data. With focus on the index and query, the MRSE system
Page 194
consists of four algorithms as follows-Setup (1l) Taking a security parameter l as input, the data owner outputs a symmetric key as SK.
4.MRSE I SCHEME
In our more advanced design, instead of simply removing the extended dimension in the query vector as we plan to do
at the first glance, we preserve this dimension extending operation but assign a new random number t to the extended
dimension in each query vector. Such a newly added randomness is expected to increase the difficulty for the cloud
server to learn the relationship among the received trapdoors. In addition, as mentioned in the keyword privacy
requirement, randomness should also be carefully calibrated in the search result to obfuscate the document frequency
and diminish the chances for re-identification of keywords. Introducing some randomness in the final similarity score is
an effective way towards what we expect here. More specifically, unlike the randomness involved in the query vector,
Page 195
we insert a dummy keyword into each data vector and assign a random value to it. Each individual vector Di is
extended to (n+2)-dimension instead of (n + 1), where a random variable irepresenting the dummy keyword is stored
in the extended dimension. The whole scheme to achieve ranked search with multiple keywords over encrypted data is
as follows.
Setup The data owner randomly generates a (n + 2)-bit vector as S and two (n+2)(n+2) invertible matrices
{M1,M2}. The secret key SK is in the form of a 3-tuple as {S,M1,M2}.
BuildIndex(F, SK) The data owner generates a binary data vector Di for every document Fi, where each binary bit
Di[j] represents whether the corresponding keyword Wjappears in the document Fi. Subsequently, every plaintext
sub-index is generated by applying dimension extending and splitting procedures on Di. These procedures are
similar with those in the secure kNNcomputation except that the (n + 1)-th entry in isset to a random number i,
and the (n + 2)-th entry in is set to 1 during the dimension extending. is therefore equal to (Di, i, 1).
Trapdoor( ) With t keywords of interest in fW as input,one binary vector Q is generated where each bit Q[j]indicates
whether Wj is true or false. Q is firstextended to n + 1-dimension which is set to 1, andthen scaled by a random
number r = 0, and finallyextended to a (n + 2)-dimension vector as where thelast dimension is set to another
random number t. is therefore equal to (rQ, r, t). After applying the samesplitting and encrypting processes as
above, the trapdoor
Query( , k, I) With the trapdoor
, the cloud servercomputes the similarity scores of each document Fi. WLOG,
we assume r > 0. After sorting allscores, the cloud server returns the top-k ranked id list
.
5.CONCLUSION
In this paper, for the first time we define and solve the problem of multi-keyword ranked search over encrypted cloud
data, and establish a variety of privacy requirements. Among various multi-keyword semantics, we choose the efficient
similarity measure of coordinate matching, i.e., as many matches as possible, to effectively capture the relevance of
outsourced documents to the query keywords, and use inner product similarity to quantitatively evaluate such
similarity measure. For meeting the challenge of supporting multi-keyword semantic without privacy breaches, we
propose a basic idea of MRSE using secure inner product computation. Then we give two improved MRSE schemes to
achieve various stringent privacy requirements in two different threat models. Thorough analysis investigating privacy
and efficiency guarantees of proposedschemes is given, and experiments on the real world dataset show our proposed
schemes introduce low overhead on both computation and communication.
REFERENCES
[1] Ning Cao, Cong Wang, Ming Li, KuiRen, and Wenjing Lou, Privacy-Preserving Multi-keyword Ranked Search
over Encrypted Cloud Data IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS VOL:25
NO:1 YEAR 2014
[2] L. M. Vaquero, L. Rodero-Merino, J. Caceres, and M. Lindner, A break in the clouds: towards a cloud definition,
ACM SIGCOMM Comput.Commun.Rev., vol. 39, no. 1, pp. 5055, 2009.
[3] S. Kamara and K. Lauter, Cryptographic cloud storage, in RLCPS, January 2010, LNCS. Springer, Heidelberg.
[4] A. Singhal, Modern information retrieval: A brief overview, IEEE Data Engineering Bulletin, vol. 24, no. 4, pp.
3543, 2001.
[5] I. H. Witten, A. Moffat, and T. C. Bell, Managing gigabytes: Compressing and indexing documents and images,
Morgan Kaufmann Publishing, San Francisco, May 1999.
[6] D. Song, D. Wagner, and A. Perrig, Practical techniques for searches on encrypted data, in Proc. of S&P, 2000.
[7] E.-J. Goh, Secure indexes, Cryptology ePrint Archive, 2003, http://eprint.iacr.org/2003/216.
[8] Y.-C. Chang and M. Mitzenmacher, Privacy preserving keyword searches on remote encrypted data, in Proc. of
ACNS, 2005.
[9] R. Curtmola, J. A. Garay, S. Kamara, and R. Ostrovsky, Searchable symmetric encryption: improved definitions
and efficient constructions,in Proc. of ACM CCS, 2006.
[10] D. Boneh, G. D. Crescenzo, R. Ostrovsky, and G. Persiano, Public key encryption with keyword search, in Proc.
of EUROCRYPT, 2004.
[11] M. Bellare, A. Boldyreva, and A. ONeill, Deterministic and efficiently searchable encryption, in Proc. of
CRYPTO, 2007.
[12] M. Abdalla, M. Bellare, D. Catalano, E. Kiltz, T. Kohno, T. Lange, J. Malone-Lee, G. Neven, P. Paillier, and H.
Shi, Searchable encryption revisited: Consistency properties, relation to anonymous ibe, and extensions, J.
Cryptol., vol. 21, no. 3, pp. 350391, 2008.
[13] J. Li, Q. Wang, C. Wang, N. Cao, K. Ren, and W. Lou, Fuzzy keyword search over encrypted data in cloud
computing, in Proc. of IEEEINFOCOM10 Mini-Conference, San Diego, CA, USA, March 2010.
Page 196
[14] D. Boneh, E. Kushilevitz, R. Ostrovsky, and W. E. S. III, Public key encryption that allows pir queries, in Proc.
of CRYPTO, 2007.
[15] P. Golle, J. Staddon, and B. Waters, Secure conjunctive keyword search over encrypted data, in Proc. of ACNS,
2004, pp. 3145.
[16] L. Ballard, S. Kamara, and F. Monrose, Achieving efficient conjunctive keyword searches over encrypted data,
in Proc. of ICICS, 2005.
[17] D. Boneh and B. Waters, Conjunctive, subset, and range queries on encrypted data, in Proc. of TCC, 2007, pp.
535554.
[18] R. Brinkman, Searching in encrypted data, in University of Twente, PhD thesis, 2007.
[19] Y. Hwang and P. Lee, Public key encryption with conjunctive keyword search and its extension to a multi-user
system, in Pairing, 2007.
[20] J. Katz, A. Sahai, and B. Waters, Predicate encryption supportingdisjunctions, polynomial equations, and inner
products, in Proc. OfEUROCRYPT, 2008.
[21] A. Lewko, T. Okamoto, A. Sahai, K. Takashima, and B. Waters, Fully secure functional encryption: Attributebased encryption and (hierarchical) inner product encryption, in Proc. of EUROCRYPT, 2010. [21] E. Shen, E.
Shi, and B. Waters, Predicate privacy in encryption systems, in Proc. of TCC, 2009.
[22] C. Wang, N. Cao, J. Li, K. Ren, and W. Lou, Secure ranked keyword search over encrypted cloud data, in Proc.
of ICDCS10, 2010.
[23] W. K. Wong, D. W. Cheung, B. Kao, and N. Mamoulis, Secure knn computation on encrypted databases, in
Proceedings of the 35thSIGMOD international conference on Management of data, 2009, pp. 139152.
[24] S. Yu, C. Wang, K. Ren, and W. Lou, Achieving secure, scalable, and fine-grained data access control in cloud
computing, in Proc. OfINFOCOM, 2010.
[25] C. Wang, Q. Wang, K. Ren, and W. Lou, Privacy-preserving public auditing for data storage security in cloud
computing, in Proc. OfINFOCOM, 2010.
[26] S. Zerr, E. Demidova, D. Olmedilla, W. Nejdl, M. Winslett, and S. Mitra, Zerber: r-confidential indexing for
distributed documents, in Proc. OfEDBT, 2008, pp. 287298.
[27] S. Zerr, D. Olmedilla, W. Nejdl, and W. Siberski, Zerber+r: Top-k retrieval from a confidential index, in Proc. of
EDBT, 2009, pp. 439449.
[28] Y. Ishai, E. Kushilevitz, R. Ostrovsky, and A. Sahai, Cryptography from anonymity, in Proceedings of the 47th
Annual IEEE Symposiumon Foundations of Computer Science, 2006, pp. 239248.
[29] W. W. Cohen, Enron email dataset, http://www.cs.cmu.edu/ enron/
Page 197